During a study in mid-2019, a team of Facebook employees found that proposed rules for Instagram’s automated account removal system disproportionately flagged and banned Black users. When the team approached CEO Mark Zuckerberg and his upper echelon of cronies with this alarming information, they were purportedly ignored and told to halt any further research regarding racial bias in the company’s moderation tools, NBC News reported Thursday.
That’s according to eight current and former Facebook employees who spoke with the outlet on the condition of anonymity. Under these proposed rules, Black Instagram users were roughly 50% more likely than white users to see their accounts automatically disabled for ToS infractions like posting hate speech and bullying.
The issue evidently stemmed from an attempt by Facebook, which owns Instagram, to keep its automated moderation systems neutral by creating the algorithmic equivalent of “You know, I don’t really see color.”
The company’s hate speech policy holds disparaging remarks against privileged groups (i.e. white people and men) to the same scrutiny as disparaging remarks against marginalized groups (i.e. Black people and women). In practice, this meant that the company’s proactive content moderation tools detected hate speech directed at white people at a much higher rate than hate speech directed at Black people, in large part because the tools were flagging comments widely considered innocuous. For example, the phrase “white people are trash” isn’t anywhere near as offensive as the phrase “Black people are trash” — and if you disagree, I hate to be the one to tell you this, but you might be a racist.
“The world treats Black people differently from white people,” one employee told NBC. “If we are treating everyone the same way, we are already making choices on the wrong side of history.”
Another employee who posted about the research on an internal forum said the findings indicated that Facebook’s automated tools “disproportionately defend white men.” Per the outlet:
According to a chart posted internally in July 2019 and leaked to NBC News, Facebook proactively took down a higher proportion of hate speech against white people than was reported by users, indicating that users didn’t find it offensive enough to report but Facebook deleted it anyway. In contrast, the same tools took down a lower proportion of hate speech targeting marginalized groups including Black, Jewish and transgender users than was reported by users, indicating that these attacks were considered to be offensive but Facebook’s automated tools weren’t detecting them.
These proposed rules never saw the light of day, as Instagram purportedly ended up implementing a revised version of this automated moderation tool. However, employees told NBC they were barred from testing it for racial bias after it’d been tweaked.
In response to the report, Facebook claimed that the researchers’ original methodology was flawed, though the company didn’t deny that it had issued a moratorium on investigating possible racial bias in its moderation tools. Facebook’s VP of growth and analytics, Alex Schultz, cited ethics and methodology concerns for the decision in an interview with NBC.
The company added that it’s currently researching better ways to test for racial bias in its products, which falls in line with Facebook’s announcement earlier this week that it’s assembling new teams to study potential racial impacts on its platforms.
“We are actively investigating how to measure and analyze internet products along race and ethnic lines responsibly and in partnership with other companies,” Facebook spokeswoman Carolyn Glanville said in a statement to multiple outlets.
In his interview with NBC, Schultz added that racial bias on Facebook’s platforms is a “very charged topic,” but said the company has “massively increased our investment” in investigating algorithmic bias and understanding its effects on hate speech moderation.
Given Facebook’s penchant for hosting racist, transphobic, sexist, and generally god-awful content, the fact that some algorithmic magic behind the scenes might be helping to stamp out marginalized voices is hardly surprising. Disappointing, for sure, but not surprising.