Microsoft Quietly Pulls Its Database of 100,000 Faces Used By Chinese Surveillance Companies

Back in 2016, Microsoft built a database of more than 10 million images featuring roughly 100,000 people. Today, the Financial Times has reported Microsoft quietly deleted this database, dubbed MS Celeb, from the internet.

Before its deletion, MS Celeb was the largest public facial recognition data set in the world. It was purportedly called ‘Celeb’ to imply the faces in the data set were from public figures. The catch, according to the Financial Times, is that many people featured in the set were not asked for their consent. Instead, their images were scraped from image and video searches under the Creative Commons license. (Under the license, you can reuse photos for academic research. And it is the copyright owner, not necessarily the subject of the photos, who grants the license.) However, the Financial Times found the set contained faces of private citizens and security journalists, including Kim Zetter, Adrian Chen, and Julie Brill, a former FTC commissioner, among others.

“The site was intended for academic purposes,” Microsoft told the Financial Times. “It was run by an employee that is no longer with Microsoft and has since been removed.”

Unfortunately, it’s not quite that simple. The MS Celeb set has been used by several companies, including IBM, Panasonic, Alibaba, Nvidia, and Hitachi. It’s also been used by Sensetime and Megvii. These two companies are suppliers to Chinese officials in Xinjiang, where facial recognition tech and artificial intelligence have been used to track and imprison minority groups like Uighurs and Muslims. Sensetime was valued at more than $4.5 billion as of late 2018, and its SenseTotem and SenseFace systems are used by various Chinese police departments. Megvii recently raised $750 million in series D funding, and its Face++ tech was cited in a Human Rights Watch report as a provider to the Integrated Joint Operations Platform, a police app used in Xinjiang. However, the group later amended its report to say that the Face++ account in the IJOP code had never been actively used. In a New York Times report, both companies denied direct knowledge of their software being used to racially profile minorities in China.

It’s unclear whether the MS Celeb data set played a role in racial profiling efforts in Xinjiang, and if it did, how critical the data set was in developing that technology. However, researchers at MegaPixel contend that Microsoft clearly lost control over who actually used the data set. A chart shows that China topped the list of countries citing MS Celeb in both 2018 and 2019.

Microsoft itself has been vocal about its opposition to using such tech as a form of government surveillance. In a December 2018 blog post, Microsoft called on companies to create safeguards and on governments to start regulating facial recognition tech. In the post, it also acknowledged the potential for governments to abuse facial recognition. Earlier, in April, Microsoft also reportedly turned down a California law enforcement agency’s request to install facial recognition tech in officers’ cars and body cameras, as doing so would disproportionately affect women and minorities.

However, Microsoft’s objections and good intentions only go so far. The FT noted the MS Celeb data set is still available to any academic institution or company that had previously downloaded it, and it’s still being shared on GitHub, Dropbox, and Baidu Cloud. Gizmodo reached out to Microsoft for comment but did not immediately receive a reply.

[Financial Times]
