Policing hate speech and offensive content on a platform as large as Facebook is a big challenge. Filtering for nasty words and phrases is simple enough, but in the age of memes it’s a lot more difficult for the website to detect sensitive posts without human input. To make things a bit easier, Facebook is deploying an AI watchdog that can sniff out bad posts all on its own.
The AI, which can sift through an immense amount of data in a very short period of time, can read text that’s been overlaid on an image or video, and it understands several languages. It’s called Rosetta, and Facebook took some time to explain how it works in a new blog post.
The AI first detects which regions of an image or video are likely to contain text, then breaks the suspected text into individual words and reads each one. The algorithm has to be versatile enough to tackle a number of different languages, including ones such as Arabic that are written right to left.
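To make that two-step flow concrete, here is a minimal sketch in Python. It is not Facebook's code: the names `extract_text`, `crop` and `ExtractedWord` are hypothetical, the image is assumed to be a plain grid of pixel values, and the detector and recognizer are passed in as ordinary functions that stand in for trained models.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Sequence, Tuple

# Assumption: a box is (left, top, right, bottom) in pixel coordinates.
Box = Tuple[int, int, int, int]
# Assumption: an image is a grid of grayscale pixel values, one list per row.
Image = Sequence[Sequence[int]]


@dataclass(frozen=True)
class ExtractedWord:
    """One word read from the image, with the region it came from."""
    box: Box
    text: str


def crop(image: Image, box: Box) -> List[List[int]]:
    """Cut the pixels inside a bounding box out of the image."""
    left, top, right, bottom = box
    return [list(row[left:right]) for row in image[top:bottom]]


def extract_text(
    image: Image,
    detect: Callable[[Image], Iterable[Box]],
    recognize: Callable[[Image], str],
) -> List[ExtractedWord]:
    """Step 1: find candidate regions. Step 2: read the word in each one."""
    words = []
    for box in detect(image):
        text = recognize(crop(image, box))
        if text:  # skip regions where the recognizer found nothing
            words.append(ExtractedWord(box, text))
    return words
```

Keeping detection and recognition as separate, swappable steps means the region finder can stay the same while the recognizer is retrained for new languages or scripts.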
Rosetta was trained on both human-annotated images and artificially generated ones. The company notes that manual annotation does not scale as more and more languages are added, and therefore it plans to rely solely on “synthetic” generation to help the AI continue to learn and improve.
When policing videos, the AI could grab individual frames and apply the same logic to each one, but Facebook says this approach wouldn’t work long-term.
“The naive approach of applying image-based text extraction to every single video frame is not scalable, because of the massive growth of videos on the platform, and would only lead to wasted computational resources,” the company says. “Recently, 3D convolutions have been gaining wide adoption given their ability to model temporal domain in addition to spatial domain. We are beginning to explore ways to apply 3D convolutions for smarter selection of video frames of interest for text extraction.”
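Facebook's post does not say how frames of interest would be chosen beyond mentioning 3D convolutions. As a much simpler stand-in for the same idea, the Python sketch below keeps a frame only when it differs noticeably from the last frame kept, so that near-identical frames are not sent to text extraction twice. The function names, the flat-list frame representation and the threshold value are all assumptions made for illustration.

```python
from typing import List, Sequence

# Assumption: a frame is a flat, non-empty list of grayscale pixel values,
# and every frame in a video has the same length.
Frame = Sequence[int]


def mean_abs_diff(a: Frame, b: Frame) -> float:
    """Average per-pixel difference between two frames."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def select_frames(frames: Sequence[Frame], threshold: float) -> List[int]:
    """Return indices of frames that changed enough to be worth re-reading.

    This is plain frame differencing, not a 3D convolution: it only
    compares each frame with the most recently selected one.
    """
    selected: List[int] = []
    last_kept = None
    for index, frame in enumerate(frames):
        if last_kept is None or mean_abs_diff(frame, last_kept) > threshold:
            selected.append(index)
            last_kept = frame
    return selected
```

With a five-frame clip in which only the third and fifth frames change substantially, `select_frames` would pass three frames to text extraction instead of five. The savings grow with longer stretches of static footage.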
Facebook has increasingly leaned on machine learning to help improve the platform, and it looks like content moderation on Facebook will also be the domain of AI before long.