Like many of the world’s best and worst ideas, MIT researchers’ plan to combat AI-generated deepfakes was hatched when one of their number watched their favorite not-news news show.
On the Oct. 25 episode of The Daily Show with Trevor Noah, OpenAI’s Chief Technology Officer Mira Murati talked up AI-generated images. Though she could likely discuss OpenAI’s AI image generator DALL-E 2 in great detail, it wasn’t a very in-depth interview. After all, it was aimed at viewers who likely understand little to nothing about AI art. Still, it did offer a few nuggets of thought. Noah asked Murati if there was a way to make sure AI programs don’t lead us to a world “where nothing is real, and everything that’s real, isn’t?”
Last week, researchers at the Massachusetts Institute of Technology said they wanted to answer that question. They devised a relatively simple program that uses data poisoning techniques to disturb pixels within an image, creating invisible noise that effectively makes AI art generators incapable of producing realistic deepfakes from the photos they’re fed. Aleksander Madry, a computer science professor at MIT, worked with the team of researchers to develop the program and posted their results on Twitter and his lab’s blog.
Using photos of Noah with Daily Show comedian Michael Kosta, they showed how this imperceptible noise in the image prevents a diffusion model AI image generator from creating a new photo using the original as a template. The researchers proposed that anybody planning to upload an image to the internet could first run their photo through the program, basically immunizing it against AI image generators.
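For a rough feel of how this kind of noise can be computed, here is a minimal Python sketch. It is not the MIT team’s code. It assumes a toy stand-in for the image generator’s encoder (a small random linear map), a hypothetical target embedding of all zeros, and made-up values for the noise budget `eps`, the step size and the iteration count. The function names `encode` and `immunize` are likewise illustrative. The idea it demonstrates is a standard one from adversarial machine learning: nudge each pixel by a tiny, capped amount in whatever direction pulls the model’s internal view of the photo toward something unrelated.

```python
import random

def encode(weights, pixels):
    """Toy linear 'encoder' (an assumption): one dot product per output feature."""
    return [sum(w * p for w, p in zip(row, pixels)) for row in weights]

def distance(a, b):
    """Euclidean distance between two embeddings."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def immunize(pixels, weights, target, eps=0.03, step=0.005, iters=50):
    """Projected signed-gradient descent on ||encode(x + delta) - target||^2.

    eps caps how far any single pixel may move, which keeps the noise small.
    """
    delta = [0.0] * len(pixels)
    for _ in range(iters):
        perturbed = [p + d for p, d in zip(pixels, delta)]
        residual = [e - t for e, t in zip(encode(weights, perturbed), target)]
        # Gradient of the squared distance with respect to each pixel: 2 * W^T r
        grad = [2 * sum(row[i] * r for row, r in zip(weights, residual))
                for i in range(len(pixels))]
        for i, g in enumerate(grad):
            sign = (g > 0) - (g < 0)
            d = delta[i] - step * sign
            d = max(-eps, min(eps, d))                    # stay within +/- eps
            d = max(-pixels[i], min(1.0 - pixels[i], d))  # keep pixel in [0, 1]
            delta[i] = d
    return [p + d for p, d in zip(pixels, delta)]

# Hypothetical 64-pixel grayscale "photo" and 8-feature encoder, seeded for repeatability.
rng = random.Random(0)
pixels = [rng.random() for _ in range(64)]
weights = [[rng.uniform(-1, 1) for _ in range(64)] for _ in range(8)]
target = [0.0] * 8  # assumed target embedding, e.g. that of a blank image

protected = immunize(pixels, weights, target)
max_change = max(abs(a - b) for a, b in zip(protected, pixels))
before = distance(encode(weights, pixels), target)
after = distance(encode(weights, protected), target)
print(f"largest pixel change: {max_change:.3f}")
print(f"distance to target embedding: {before:.2f} before, {after:.2f} after")
```

In this toy setup no pixel moves by more than `eps`, yet the encoder’s output shifts measurably toward the target. A real diffusion model’s encoder is a deep network rather than a linear map, so the gradient would come from automatic differentiation instead of the closed-form expression used here.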
Hadi Salman, a PhD student at MIT whose work revolves around machine learning models, told Gizmodo in a phone interview that the system he helped develop only takes a few seconds to introduce noise into a photo. Higher resolution images work even better, he said, since they include more pixels that can be minutely disturbed.
Google is creating its own AI image generator called Imagen, though few people have been able to put the system through its paces. The company is also working on a generative AI video system. Salman said the team hasn’t tested its system on video, but in theory it should still work, though the MIT program would have to individually process every frame of a video, which could mean tens of thousands of frames for any video longer than a few minutes.
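The frame math is easy to check. The short Python sketch below assumes a 30-frames-per-second video and roughly three seconds of processing per frame, a stand-in for the “few seconds” per photo that Salman described. Both numbers, and the function names, are illustrative assumptions rather than figures from the MIT team, and the estimate assumes frames are handled one at a time.

```python
def frames_in_video(minutes, fps=30):
    """Number of frames in a video of the given length (fps is an assumption)."""
    return int(minutes * 60 * fps)

def immunization_hours(minutes, fps=30, seconds_per_frame=3.0):
    """Rough sequential processing time, assuming a fixed cost per frame."""
    return frames_in_video(minutes, fps) * seconds_per_frame / 3600

frames = frames_in_video(10)    # 10 min * 60 s * 30 fps = 18000 frames
hours = immunization_hours(10)  # 18000 frames * 3 s / 3600 = 15.0 hours
print(frames, hours)
```

Under those assumptions, a 10-minute clip would take most of a day to protect, which is why video remains an open question for this approach.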
Salman said he could imagine a future where companies, even the ones who generate the AI models, could certify that uploaded images are immunized against AI models. Of course, that isn’t much good news for the millions of images already uploaded to open source libraries like LAION, but it could potentially make a difference for any image uploaded in the future.
Madry also told Gizmodo via phone that although their data poisoning has worked in many of their tests, the system is more of a proof of concept than a product release of any kind. The researchers’ program proves that there are ways to defeat deepfakes before they happen.
Companies, he said, need to get to know this technology and implement it in their own systems to make it even more resistant to tampering. Moreover, the companies would need to make sure that future renditions of their diffusion models, or any other kind of AI image generator, won’t be able to ignore the noise and generate new deepfakes.
“What really should happen moving forward is that all the companies that develop diffusion models should provide capability for healthy, robust immunization,” Madry said.
Other experts in the machine learning field did find some points to critique in the MIT researchers’ work.
Florian Tramèr, a computer science professor at ETH Zurich in Switzerland, tweeted that the major difficulty is you essentially get one try to fool all future attempts at creating a deepfake with an image. Tramèr was the co-author of a 2021 paper published by the International Conference on Learning Representations that found that data poisoning, like what the MIT system does with its image noise, won’t stop future systems from finding ways around it. Moreover, creating these data poisoning systems will create an “arms race” between commercial AI image generators and those trying to prevent deepfakes.
There have been other data poisoning programs meant to deal with AI-based surveillance, such as Fawkes (yes, like the 5th of November), which was developed by researchers at the University of Chicago. Fawkes also distorts pixels in images in a way that keeps companies like Clearview from achieving accurate facial recognition. Other researchers from the University of Melbourne in Australia and Peking University in China have also analyzed possible systems that can create “unlearnable examples” that AI image generators can’t use.
The problem is, as noted by Fawkes developer Emily Wenger in an interview with MIT Technology Review, programs like Microsoft Azure managed to win out against Fawkes and detect faces despite its adversarial techniques.
Gautam Kamath, a computer science professor at the University of Waterloo in Ontario, Canada, told Gizmodo in a Zoom interview that in the “cat and mouse game” between those trying to create AI models and those finding ways to defeat them, the people manufacturing new AI systems seem to have the edge, since once an image is on the internet, it’s never really going away. Therefore, if an AI system manages to bypass the protections meant to keep an image from being deepfaked, there’s no real way to remedy it.
“It’s possible, if not likely, that in the future we’ll be able to evade whatever defenses you put on that one particular image,” Kamath said. “And once it’s out there, you can’t take it back.”
Of course, there are some AI systems that can detect deepfake videos, and there are ways to train people to detect the small inconsistencies that show a video is being faked. The question is: will there come a time when neither human nor machine can discern if a photo or video has been manipulated?
For Madry and Salman, the answer is in getting the AI companies to play ball. Madry said they are looking to touch base with some of the major AI generator companies to see if they would be interested in facilitating their proposed system, though of course it’s still in early days, and the MIT team’s still working on a public API that would let users immunize their own photos (the code is available here).
In that way, it’s all dependent on the people who make the AI image platforms. OpenAI’s Murati told Noah in that October episode they have “some guardrails” for their system, further claiming they don’t allow people to generate images based on public figures (which is a rather nebulous term in the age of social media where practically everyone has a public face). The team is also working on more filters that will restrict the system from creating images that contain violent or sexual content.
Back in September, OpenAI announced users could once again upload human faces to their system, but claimed they had built in ways to stop users from showing faces in violent or sexual contexts. It also asked users not to upload images of people without their consent, but it’s a lot to ask of the general internet to make promises without crossing their fingers.
However, that’s not to say other AI generators and the people who made them are as game to moderate the content their users generate. Stability AI, the company behind Stable Diffusion, has shown it’s much more reluctant to introduce any barriers that stop people from creating porn or derivative artwork using its system. While OpenAI has been, ahem, open about trying to stop its system from displaying bias in the images it generates, Stability AI has kept pretty mum.
Emad Mostaque, the CEO of Stability AI, has argued for a system without government or corporate influence, and has so far fought back against calls to put more restrictions on his AI model. He has said he believes image generation will be “solved in a year” allowing users to create “anything you can dream.” Of course, that’s just the hype talking, but it does show Mostaque isn’t willing to back down from seeing the technology push itself further and further.
Still, the MIT researchers are remaining steady.
“I think there’s a lot of very uncomfortable questions about what is the world when this kind of technology is easily accessible, and again, it’s already easily accessible and will be even more easy for use,” Madry said. “We’re really glad, and we are really excited about this fact that we can now do something about this consensually.”