Google Is Fighting Internet Trolls With AI

Google subsidiary Jigsaw is developing an AI program that can analyze language patterns to identify online harassment. The technology is called “Conversation AI,” and its team claims that, with an accuracy rate of 92%, it’s already better than human moderators.

“I want to use the best technology we have at our disposal to begin to take on trolling and other nefarious tactics that give hostile voices disproportionate weight,” says Jigsaw founder and president Jared Cohen in an interview for Wired. “To do everything we can to level the playing field.”

Jigsaw was named after its focus on physical results, if you were wondering. No relation to that serial killer from the ‘Saw’ franchise.

How It Works

Social media, and Twitter in particular, has a problem with online harassment. Users join together on the basis of a shared interest and then brigade individuals over perceived disses. Earlier this year, Reddit tweaked its user blocking rules in an attempt to contain harassment. Periscope took similar measures in early June. The patterns of abuse are easy to point out, and AI has been developed as a potential solution in many cases, including by Google itself.

Jigsaw’s Conversation AI relies on machine learning, a process in which the AI uses a series of examples to learn what abuse looks like, rather than relying on a pre-programmed matrix. Does it work? From Wired:

“[B]y some measures Jigsaw has now trained Conversation AI to spot toxic language with impressive accuracy. Feed a string of text into its Wikipedia harassment-detection engine and it can, with what Google describes as more than 92 percent certainty and a 10 percent false-positive rate, come up with a judgment that matches a human test panel as to whether that line represents an attack.”
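To make the idea of learning from examples concrete, here is a minimal sketch in Python. It is not Jigsaw's system: it swaps in a tiny naive Bayes word-count classifier, and the training comments, the function names (train, predict, evaluate) and the labels are all invented for illustration. It shows two things the quote mentions: a model that learns what an attack looks like from human-labeled examples, and how agreement with a human panel and a false-positive rate could be computed.

```python
import math
from collections import Counter

# Hypothetical human-labeled examples: (comment, is_attack).
TRAIN = [
    ("you are an idiot", True),
    ("shut up you idiot", True),
    ("nobody likes you loser", True),
    ("thanks for the helpful edit", False),
    ("i think this source is wrong", False),
    ("great point thanks", False),
]

def train(examples):
    """Count how often each word appears in attack vs. non-attack comments."""
    word_counts = {True: Counter(), False: Counter()}
    label_counts = Counter()
    for text, is_attack in examples:
        label_counts[is_attack] += 1
        word_counts[is_attack].update(text.lower().split())
    vocab = set(word_counts[True]) | set(word_counts[False])
    return word_counts, label_counts, vocab

def predict(model, text):
    """Return True if the text looks more like the attack examples."""
    word_counts, label_counts, vocab = model
    total = sum(label_counts.values())
    scores = {}
    for label in (True, False):
        n_words = sum(word_counts[label].values())
        score = math.log(label_counts[label] / total)
        for word in text.lower().split():
            # Laplace smoothing so unseen words do not zero out the score.
            score += math.log((word_counts[label][word] + 1) / (n_words + len(vocab)))
        scores[label] = score
    return scores[True] > scores[False]

def evaluate(predictions, human_labels):
    """Return (agreement with the human labels, false-positive rate).

    Assumes at least one comment was labeled as not an attack.
    """
    pairs = list(zip(predictions, human_labels))
    agreement = sum(p == h for p, h in pairs) / len(pairs)
    # False positives: comments humans called fine that the model flagged.
    benign_predictions = [p for p, h in pairs if not h]
    false_positive_rate = sum(benign_predictions) / len(benign_predictions)
    return agreement, false_positive_rate

model = train(TRAIN)
print(predict(model, "you idiot"))
print(predict(model, "thanks for the edit"))
```

With a 10 percent false-positive rate, roughly one in ten comments that the human panel considered harmless would still be flagged. The evaluate function above measures that by looking only at the comments humans labeled as not an attack.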

The algorithm currently focuses only on individual strings of text, but the team hopes it can eventually process long-term patterns across entire social media profiles.

On the less optimistic side, some have raised concerns about the false-positive rate. Still, it can’t be worse than letting internet trolls go unstopped.


Written by:
Adam is a writer at and has worked as a tech writer, blogger and copy editor for more than a decade. He was a Forbes Contributor on the publishing industry, for which he was named a Digital Book World 2018 award finalist. His work has appeared in publications including Popular Mechanics and IDG Connect, and his art history book on 1970s sci-fi, 'Worlds Beyond Time,' is out from Abrams Books in July 2023. In the meantime, he's hunting down the latest news on VPNs, POS systems, and the future of tech.