Twitter's Testing Another New Message Filter to Weed Out Abuse in Your DMs

Image: Matt Rourke (AP)

In Twitter’s ongoing crusade against the rampant abuse on its platform that it keeps promising to address (no, legit guys) (this time for realsies), the company recently announced it’s testing a new filter to hopefully help keep trash out of your DMs.

“Unwanted messages aren’t fun,” the company tweeted Thursday in what’s perhaps one of the greatest understatements of the year. So very-not-fun, in fact, that last year Twitter implemented its first direct message filter (given the rather on-the-nose title Quality Filter) for accounts open to DMs from anyone. That filter separates messages sent by mutuals from messages sent by strangers by filing them in separate folders.

With Thursday’s new filter, in addition to the possibly sketchy “Message requests” from people you don’t follow, there are now the most definitely sketchy “Additional messages” from people you don’t follow whose messages “may contain offensive content.” The latter aren’t visible until users opt to see them.

How effective Twitter will be at identifying potentially abusive content in these DMs, though, is another question. The platform has historically relied on users flagging material that might violate its terms of service, though the company’s reportedly been making strides in preemptively flagging content via algorithms. In an April blog post, Twitter announced that “38% of abusive content that’s enforced is surfaced proactively to our teams for review instead of relying on reports from people on Twitter.”

While that’s certainly progress, it has to be taken with a grain of salt, given that a Twitter executive recently stated that there’s still a large divide between what the platform and its users consider to be offensive. “A lot of what people consider abusive on the service doesn’t actually violate our policies,” Twitter executive Kayvon Beykpour said in June at Recode’s Code Conference.

This discrepancy may explain, to cite one example, why the platform continues to host several prominent white supremacists despite the many users and civil rights groups calling for their removal. So while I’d love to see this new filter successfully weed out abuse in my inbox, given that Twitter seems to have a different definition of abuse than most of its users, excuse me if I don’t get my hopes up.

Updated: A Twitter spokesperson responded to Gizmodo via email to give us the lowdown on how things work behind the scenes with this recent round of testing (and to clarify that, despite the company’s tweet about “testing a filter,” this new experiment is actually an extension of its existing Quality Filter, which was originally rolled out last year).

The DMs popping up in this new “Additional messages” folder were originally filtered out completely because Twitter’s algorithm flagged them as spam or potentially harmful material based on “behavioral signals” of the sender. These signals include whether an account has been verified with a phone number, when it was created, location data, etc. Essentially, all the signs you would think of to spot a bot account.

With this recent experiment, Twitter users can now choose to see for themselves if these messages are actually junk or whether they were incorrectly flagged.
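
For the curious, here is a rough sketch of what sorting DMs by sender behavior rather than message text could look like. To be clear, this is not Twitter's code: the `Sender` fields, the 30-day threshold, and the rule combining the signals are all assumptions made up for illustration, and location data is left out entirely. Only the folder names and the kinds of signals come from the description above.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Sender:
    # Hypothetical stand-ins for the "behavioral signals" described above.
    followed_by_recipient: bool
    phone_verified: bool
    created_on: date


def route_dm(sender: Sender, today: date, min_account_age_days: int = 30) -> str:
    """Pick a folder using only sender signals, never the message text."""
    if sender.followed_by_recipient:
        return "Inbox"
    account_age_days = (today - sender.created_on).days
    # Assumed rule: an unverified, recently created account looks bot-like.
    looks_suspicious = (
        not sender.phone_verified and account_age_days < min_account_age_days
    )
    return "Additional messages" if looks_suspicious else "Message requests"
```

Note that the function never sees the message itself, which is the whole point: a perfectly polite DM from a brand-new, unverified account would land in “Additional messages” under these made-up rules, while a nasty one from an established account would not.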

In short, Twitter isn’t judging potentially abusive DMs based on the actual content of the messages, but rather the behavior of the sender. So color me surprised and serve me up a bit of crow while you’re at it. I still don’t have my hopes up (I’ve been on this hellsite far too long for that kind of naivete), but I am curious to check my own “Additional messages” folder to see how accurate Twitter’s algorithm was at flagging crap before it reaches (and burns) my eyes.

About the author

Alyse Stanley

Gizmodo weekend editor. Freelance video game reporter. Full-time disaster bi.