Image uploads are the single biggest source of moderation risk on any chat platform. A bad text message is annoying. A bad image can violate platform policies, scar a member, or trigger legal exposure for the whole site. So we treat image moderation as a hard problem and stage the work across three layers.
Layer 1 — pre-upload sanity checks
Before an image even reaches our servers we check the simple stuff. File type must be a recognised image format. Size has to be under our per-room cap. The uploader has to be a real, signed-in account in good standing. Throttling per account per minute stops the most obvious flood patterns. None of this is fancy, but it stops about 90 percent of low-effort abuse before it costs us a single API call.
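As a rough illustration, the four checks above can be sketched as a single gate that runs before any classifier call. Everything named here is hypothetical: the `PreUploadGate` class, the `Account` and `Upload` shapes, the list of allowed MIME types, and the five-per-minute limit are assumptions for the example, not our production values.

```python
from collections import defaultdict, deque
from dataclasses import dataclass

# Assumed values for illustration only; the real limits are not stated in this post.
ALLOWED_TYPES = {"image/jpeg", "image/png", "image/gif", "image/webp"}
MAX_UPLOADS_PER_MINUTE = 5


@dataclass
class Account:
    account_id: str
    signed_in: bool
    in_good_standing: bool


@dataclass
class Upload:
    content_type: str
    size_bytes: int


class PreUploadGate:
    """Cheap checks that run before an image is sent for classification."""

    def __init__(self, max_per_minute=MAX_UPLOADS_PER_MINUTE):
        self.max_per_minute = max_per_minute
        # account_id -> timestamps (in seconds) of recently accepted uploads
        self._recent = defaultdict(deque)

    def check(self, upload, account, room_cap_bytes, now):
        """Return (accepted, reason). `now` is a timestamp in seconds."""
        if upload.content_type not in ALLOWED_TYPES:
            return False, "unsupported_type"
        if upload.size_bytes > room_cap_bytes:
            return False, "too_large"
        if not (account.signed_in and account.in_good_standing):
            return False, "account_not_eligible"

        # Sliding one-minute window: drop timestamps that are 60 seconds old or more.
        window = self._recent[account.account_id]
        while window and now - window[0] >= 60:
            window.popleft()
        if len(window) >= self.max_per_minute:
            return False, "rate_limited"

        window.append(now)
        return True, "ok"
```

The checks are ordered cheapest first, and only accepted uploads count toward the rate limit, so a rejected file does not use up a member's allowance.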
Layer 2 — automated content classification
Every accepted image is fed through an automated classifier that returns a structured score across the categories we care about, such as sexual content, graphic violence, and hateful imagery. The classifier we use is good but not perfect, so we treat its output as a recommendation rather than a verdict. A high-confidence positive on any prohibited category triggers an automatic hold, and the uploader is notified that the image is pending review. A clean score releases the image immediately into the room.
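A minimal sketch of that hold-or-release decision, assuming the classifier returns a score between 0 and 1 per category. The category keys, the `decide` function, and the 0.9 threshold are all hypothetical. The sketch also simplifies by treating anything below the threshold as clean.

```python
# Hypothetical category keys and threshold, chosen for illustration only.
PROHIBITED = ("sexual", "graphic_violence", "hateful")
HOLD_THRESHOLD = 0.9


def decide(scores, threshold=HOLD_THRESHOLD):
    """Map classifier scores to a hold or release action.

    `scores` maps category name to a confidence in [0, 1].
    Categories missing from `scores` are treated as 0.0.
    """
    flagged = sorted(c for c in PROHIBITED if scores.get(c, 0.0) >= threshold)
    if flagged:
        # The image is held for human review and the uploader is told it is pending.
        return {"action": "hold", "categories": flagged, "notify_uploader": True}
    return {"action": "release", "categories": [], "notify_uploader": False}
```

Note that "hold" is not "remove": the classifier only routes the image to a human, which is what treating its output as a recommendation means in practice.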
Layer 3 — human review of edge cases
Anything held by Layer 2 goes into a moderator review queue, and so does any image that another member reports. A human moderator decides whether to release, remove, or escalate. Every decision is logged the same way text moderation decisions are logged, with the rule cited, the acting moderator, and the timestamp, and the uploader can appeal it exactly as they would any other moderation action.
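The queue and the decision log could look something like the sketch below. The `ReviewQueue` and `Decision` names, the source labels, and the choice to queue an image only once when it is both held and reported are assumptions made for the example.

```python
from collections import deque
from dataclasses import dataclass
from datetime import datetime, timezone

VALID_ACTIONS = {"release", "remove", "escalate"}


@dataclass(frozen=True)
class Decision:
    """One logged moderation decision: rule cited, acting moderator, timestamp."""
    image_id: str
    action: str
    rule_cited: str
    moderator: str
    timestamp: datetime


class ReviewQueue:
    def __init__(self):
        self._pending = deque()   # (image_id, source) in arrival order
        self._queued = set()      # image ids currently waiting for review
        self.log = []             # append-only list of Decision records

    def __len__(self):
        return len(self._pending)

    def enqueue(self, image_id, source):
        """`source` is a label such as "classifier_hold" or "member_report"."""
        if image_id not in self._queued:  # assumption: one queue entry per image
            self._queued.add(image_id)
            self._pending.append((image_id, source))

    def decide(self, action, rule_cited, moderator, now=None):
        """Record a moderator's decision on the oldest pending image."""
        if action not in VALID_ACTIONS:
            raise ValueError(f"unknown action: {action}")
        if not self._pending:
            raise IndexError("review queue is empty")
        image_id, _source = self._pending.popleft()
        self._queued.discard(image_id)
        decision = Decision(image_id, action, rule_cited, moderator,
                            now or datetime.now(timezone.utc))
        self.log.append(decision)
        return decision
```

Keeping the record frozen and the log append-only means an appeal can always be checked against what was actually decided and why.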
Why this stack instead of pure automation
Pure automation produces two predictable failures. It releases things it should not because the classifier missed something obvious to a human eye. And it blocks things it should not because the classifier panicked over context — for example, a medical photo in a health support room. Putting a human at the end of the pipeline does not slow the rooms down for the 99 percent of clean uploads; it slows down only the marginal ones that genuinely need a judgement call.
What we will not store
We do not retain raw image uploads any longer than the room itself does. Held-but-rejected images are purged after the appeal window closes. We do not feed your uploads back into any third-party training set. Our privacy policy covers the specifics, but the short version is: the images you upload exist to be seen by the room you uploaded them to, and that is the only purpose we use them for.
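The purge rule for held-but-rejected images can be sketched as a simple date comparison. The 14-day appeal window and the `images_to_purge` helper are assumptions for illustration; the actual window is set out in the privacy policy, not here.

```python
from datetime import datetime, timedelta, timezone

APPEAL_WINDOW = timedelta(days=14)  # assumed length, for illustration only


def images_to_purge(rejected, now, appeal_window=APPEAL_WINDOW):
    """Return ids of rejected images whose appeal window has closed.

    `rejected` maps image id to the datetime the image was rejected.
    """
    return sorted(
        image_id
        for image_id, rejected_at in rejected.items()
        if now - rejected_at >= appeal_window
    )
```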