There have been multiple accounts created with the sole purpose of posting advertisement posts or replies containing unsolicited advertising.

Accounts which solely post advertisements, or persistently post them may be terminated.

modulus ,

Perhaps the manual reporting tool is enough? Then that content can be forwarded to the central ms service. I wonder if that API can report back to say whether it is positive.

The problem with a lot of this tooling is you need some sort of accreditation to use it, because it somewhat relies on security through obscurity. As far as I know you can’t just hit MS’s servers and ask “is this CSAM?” If something like that were possible it might work.

Can you elaborate on the hash problem?

Sure. When you have an image, you can do lots of things to it that change it in some way: change the compression, the format, crop it, apply a filter… This all changes the file and so it changes the hash. The perceptual hash system works on the basis of some computer vision stuff and the idea is that it will try to generate the same hash for pictures that are substantially the same. But this tech is imperfect and probably will have changes. So if there’s a change in the way the hash gets calculated, it wouldn’t be enough with keeping hashes, you’d have to keep the original file to recalculate, which is storing CSAM, which is ordinarily not allowed and for good reason.

For a hint on how bad these hashes can get, they are reversible, vulnerable to pre-image attacks, and so on.

Some of this is probably inevitable in this type of systems. You don’t want to make it easy for someone to hit the servers with a large number of hashes, and then use IPFS or BitTorrent DHT to retrieve positives (you’d be helping people getting CSAM). The problem is hard.

Personally I was thinking of generating a federated set based on user reporting. Perhaps enhanced by checking with the central service as mentioned above. This db can then be synced with trusted instances.

Something like that could work, maybe obscuring some of the hash content (random parts of it) so that it doesn’t become a way to actually find the stuff.

Whatever decisions are made have to be well thought through so as not to make the problem worse.

  • All
  • Subscribed
  • Moderated
  • Favorites
  • random
  • [email protected]
  • lifeLocal
  • goranko
  • All magazines