Google subsidiary Jigsaw is developing an AI program that can analyze language patterns to identify online harassment. The technology is called “Conversation AI,” and its team claims that, with an accuracy rate of 92%, it's already better than human moderators.
“I want to use the best technology we have at our disposal to begin to take on trolling and other nefarious tactics that give hostile voices disproportionate weight,” says Jigsaw founder and president Jared Cohen in an interview for Wired. “To do everything we can to level the playing field.”
Jigsaw was named after its focus on physical results, if you were wondering. No relation to that serial killer from the ‘Saw’ franchise.
How It Works
Social media, and Twitter in particular, has a problem with online harassment. Users join together on the basis of a shared interest and then brigade individuals over perceived disses. Earlier this year, Reddit tweaked its user blocking rules in an attempt to contain harassment, and Periscope took similar measures in early June. The patterns of abuse are easy to point out, and AI has been developed as a potential solution in many cases, including by Google itself.
Jigsaw's Conversation AI relies on machine learning, a process in which the AI uses a series of labeled examples to learn what abuse looks like, rather than relying on a pre-programmed set of rules. Does it work? From Wired:
“[B]y some measures Jigsaw has now trained Conversation AI to spot toxic language with impressive accuracy. Feed a string of text into its Wikipedia harassment-detection engine and it can, with what Google describes as more than 92 percent certainty and a 10 percent false-positive rate, come up with a judgment that matches a human test panel as to whether that line represents an attack.”
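To make those two numbers concrete, here is a minimal sketch of how an agreement rate and a false-positive rate could be computed from a model's verdicts and a human panel's verdicts. It assumes the "92 percent" figure is read as simple agreement with the panel, which is only one possible interpretation of the quote. The function name and the ten sample verdicts are made up for illustration and are not Jigsaw's data.

```python
def agreement_and_false_positive_rate(model_flags, panel_flags):
    """Compare model verdicts with human-panel verdicts (True = attack).

    Returns (agreement, false_positive_rate):
      agreement           = share of comments where model and panel match
      false_positive_rate = share of panel-approved comments the model flagged
    """
    if len(model_flags) != len(panel_flags):
        raise ValueError("both lists must cover the same comments")
    if not panel_flags:
        raise ValueError("need at least one comment")

    pairs = list(zip(model_flags, panel_flags))
    matches = sum(1 for m, p in pairs if m == p)
    false_positives = sum(1 for m, p in pairs if m and not p)
    harmless = sum(1 for _, p in pairs if not p)

    agreement = matches / len(pairs)
    false_positive_rate = false_positives / harmless if harmless else 0.0
    return agreement, false_positive_rate


# Hypothetical verdicts for ten comments (illustrative only).
panel = [True, True, False, False, False, False, True, False, False, False]
model = [True, True, False, True, False, False, False, False, False, False]

agreement, fpr = agreement_and_false_positive_rate(model, panel)
print(f"agreement: {agreement:.0%}, false-positive rate: {fpr:.0%}")
```

The two figures measure different things: a model can agree with the panel most of the time and still wrongly flag a noticeable share of harmless comments, which is why both are reported.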
The algorithm focuses only on individual strings of text right now, but the team hopes it can eventually process long-term patterns across entire social media profiles.
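For a rough sense of what "learning from examples" and scoring a single string of text can look like, here is a toy sketch. It uses a bag-of-words naive Bayes classifier, a deliberately simple stand-in and not the method Jigsaw uses, which the article does not describe. The six training comments, their labels, and all function names are invented for illustration.

```python
import math
from collections import Counter


def tokenize(text):
    # Lowercase and split on whitespace; real systems do far more than this.
    return text.lower().split()


def train(examples):
    """Count words per label from (text, is_attack) pairs."""
    word_counts = {True: Counter(), False: Counter()}
    label_counts = Counter()
    for text, is_attack in examples:
        label_counts[is_attack] += 1
        word_counts[is_attack].update(tokenize(text))
    vocab = set(word_counts[True]) | set(word_counts[False])
    return word_counts, label_counts, vocab


def attack_probability(model, text):
    """Estimate the probability that one string of text is an attack."""
    word_counts, label_counts, vocab = model
    total = sum(label_counts.values())
    log_scores = {}
    for label in (True, False):
        # Add-one (Laplace) smoothing so unseen word/label pairs are not zero.
        denom = sum(word_counts[label].values()) + len(vocab)
        score = math.log(label_counts[label] / total)
        for word in tokenize(text):
            if word in vocab:  # ignore words never seen in training
                score += math.log((word_counts[label][word] + 1) / denom)
        log_scores[label] = score
    # Turn the two log scores into a probability for the "attack" label.
    top = max(log_scores.values())
    attack = math.exp(log_scores[True] - top)
    harmless = math.exp(log_scores[False] - top)
    return attack / (attack + harmless)


# Hypothetical labeled examples (True = attack), illustrative only.
EXAMPLES = [
    ("you are an idiot", True),
    ("shut up you idiot", True),
    ("nobody likes you loser", True),
    ("thanks for the helpful edit", False),
    ("i think this source is outdated", False),
    ("could you add a citation here", False),
]

model = train(EXAMPLES)
for comment in ("you idiot", "thanks for the citation"):
    print(comment, "->", round(attack_probability(model, comment), 2))
```

Nothing in the sketch lists abusive words by hand. The scores come entirely from which words showed up in which labeled examples, and each string is judged on its own with no knowledge of who wrote it or what they posted before.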
On the less optimistic side, some have raised concerns about the false-positive rate. Still, it can't be worse than letting internet trolls go unstopped.