
A New Yahoo AI Can Detect Online Abuse, But Automation Isn't the Answer

Image: Shutterstock

Online harassment is a serious but difficult problem to solve. Moderators who have to comb through abuse reports need a system that eases that burden. As someone who was once responsible for doing that for a local news station (a much smaller venue than, say, Twitter), I can sympathize with the toll seeing all those racist, abusive messages takes on your psyche. It sucks.


A team at Yahoo recently developed an algorithm that it claims can automatically identify hateful speech. The tool uses deep learning to detect abusive keywords, punctuation patterns typically found in hateful comments, and syntactic clues, trained on several thousand comments from Yahoo's websites.

But that was just the start. The researchers also used what is called "word embedding," a technique common in natural language processing that maps words and phrases to numerical vectors. According to the study released online, the vectors were able to predict the next word based on different contexts. So even words that weren't identified as abusive on their own, because they weren't slurs or derogatory language, could still be processed as such.
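The study's actual model is far more sophisticated, but the core idea of word embeddings can be sketched in a few lines of Python. The words and vector values below are invented purely for illustration; real embeddings are learned from large corpora and have hundreds of dimensions:

```python
import math

# Hypothetical toy vectors for illustration only. In a learned embedding,
# words used in similar contexts end up with similar vectors.
vectors = {
    "idiot":  [0.90, 0.80, 0.10],
    "moron":  [0.85, 0.82, 0.15],
    "vermin": [0.70, 0.90, 0.20],  # not a slur, but often used abusively
    "puppy":  [0.10, 0.05, 0.90],
}

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: near 1.0 means very similar."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# A word sitting close to known-abusive words in vector space can be
# treated as abusive even though it appears on no keyword blocklist.
print(cosine_similarity(vectors["vermin"], vectors["idiot"]))  # high
print(cosine_similarity(vectors["puppy"], vectors["idiot"]))   # low
```

The point is that similarity is measured geometrically rather than by string matching, which is why non-slur words used in hateful contexts can still be caught.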


According to MIT Technology Review, the algorithm was able to identify abusive messages with around 90 percent accuracy.

Additionally, according to Wired, the dataset will soon be released on Yahoo Webscope, opening it up for use by other researchers.

This is an interesting step forward, but automation only goes so far. Most comment systems, including Disqus, which is probably the most widely used platform, let websites ban certain words. Comments that use those words are immediately placed in a queue for a moderator to review and don't appear online. This isn't effective, though, since all it takes to slip past the filters is swapping a couple of letters or distorting a word.
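A minimal sketch shows why that kind of filter is so easy to defeat. The blocklist and filter below are hypothetical, not how Disqus actually works internally, but they illustrate the weakness:

```python
# Hypothetical blocklist for illustration; a real one would be much larger.
BLOCKLIST = {"idiot", "moron"}

def naive_filter(comment: str) -> bool:
    """Return True if the comment should be held for moderation."""
    words = comment.lower().split()
    return any(word.strip(".,!?") in BLOCKLIST for word in words)

print(naive_filter("what an idiot"))      # True: exact match is caught
print(naive_filter("what an id1ot"))      # False: one swapped letter gets through
print(naive_filter("what an i-d-i-o-t"))  # False: distortion defeats the match
```

Any exact-match approach fails the moment a commenter obfuscates the word, which is exactly the gap a context-based model is meant to close.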

The study does take some of these issues into account. Spotting certain keywords is only part of it. An algorithm can't detect sarcasm, nor can it keep up with constantly changing internet language and slang. More technically, a machine can have trouble detecting hate that spans multiple sentences. The researchers wrote:

“In the sentence Chuck Hagel will shield Americans from the desert animals bickering. Let them kill each other, good riddance!, the second sentence which actually has the most hateful intensity (them kill each other) is dependent on the successful resolution of them to desert animals which itself requires world knowledge to resolve.”


Social networks have tools that allow users to report abuse or spam, but most of them are useless. I’d have to think most people are aware that there are jerks on the internet (especially after the high-profile abuse of actor Leslie Jones at the very least), but it’s hard to find evidence that the big-name social networks are being proactive or making sweeping changes. Using artificial intelligence to do the work of humans would certainly be welcome for moderators and managers who have to process abuse reports, but it’s only a small part of what can be done to curb abuse.

There still need to be policies in place that prevent certain acts and behaviors without crossing the line into limiting free speech. There need to be websites and networks that act on those policies and remain active in the battle. And that's just the beginning.


[MIT Technology Review, Wired]

Weekend editor and night person at Gizmodo. More space core than human.



