The Malware of the Future Will Have AI Superpowers

In the past two years, we’ve learned that machine learning algorithms can manipulate public opinion, cause fatal car crashes, create fake porn, and manifest extremely sexist and racist behavior.

And now, the cybersecurity threats of deep learning and neural networks are emerging. We’re just beginning to catch glimpses of a future in which cybercriminals trick neural networks into making fatal mistakes and use deep learning to hide their malware and find their target among millions of users.

Part of the challenge of securing artificial intelligence applications lies in the fact that it’s hard to explain how they work. Even the people who create them are often hard-pressed to make sense of their inner workings. But unless we prepare ourselves for what is to come, we’ll learn to appreciate and react to these threats the hard way.

In the AI crosshairs

In 2010, the United States and Israel purportedly released Stuxnet, a piece of malware aimed at incapacitating Iran’s nuclear infrastructure. Stuxnet was designed to spread like a worm but only unleash its malicious payload if it found itself inside a network configuration identical to that of Iran’s nuclear facility in Natanz. The Stuxnet worm is still one of the most sophisticated pieces of malware ever created, and the highly targeted attack was only made possible thanks to information and resources available to intelligence agencies.

But in the age of AI, creating targeted malware can become as easy as training a neural network with pictures or voice samples of the intended target. In August, IBM researchers presented DeepLocker, proof-of-concept malware that used deep neural networks to hide its malicious payload, a variation of the WannaCry ransomware, and to activate it only when it detected its target.

They embedded DeepLocker in a seemingly innocent video conferencing application. In a hypothetical scenario, the application could be installed and used by millions of users without manifesting any malicious behavior. Meanwhile, however, the malware uses a facial recognition neural network, tuned to a picture of the intended target, to scan the computer’s webcam video feed for that person.

As soon as the target’s face shows up in front of the camera of a computer running the infected application, DeepLocker activates its ransomware, encrypting all the files on the victim’s computer.

“While the facial recognition scenario is one example of how malware could leverage AI to identify a target, other identifiers such as voice recognition or geo-location could also be used by an AI-powered malware to find its victim,” Marc Stoecklin, the lead researcher of the project, told me at the time IBM released its findings.

One can easily see how the same model can be applied in other dangerous ways, such as harming or spying on people of a specific gender or race.

A menacing aspect of AI-powered malware such as DeepLocker is that it uses the “black box” nature of deep learning to hide its malicious payload. Security researchers usually discover and document malware by reverse engineering it, activating it in sandbox conditions, and extracting its digital and behavioral signatures. Unfortunately, neural networks are extremely difficult to reverse engineer, which makes it easier for attackers to evade security tools and analysts.
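To get a feel for why a concealed trigger frustrates analysts, consider a deliberately harmless toy. It is not DeepLocker's design, which the researchers' quotes here do not detail; it swaps the neural network for a one-way hash, and the label "person-1234" and every function name are made up for illustration. The point is only that whoever inspects the program finds a digest, not the condition that satisfies it.

```python
import hashlib


def digest(label: str) -> str:
    """One-way SHA-256 digest of a recognizer's output label."""
    return hashlib.sha256(label.encode("utf-8")).hexdigest()


# What an analyst would find inside the program: a digest, not the target.
# It is computed inline here only so the example runs on its own;
# "person-1234" is a hypothetical label, not anything from the research.
STORED_DIGEST = digest("person-1234")


def is_trigger(recognizer_output: str) -> bool:
    """True only when the recognizer's output matches the hidden condition."""
    return digest(recognizer_output) == STORED_DIGEST


# An analyst probing in a sandbox has to guess the condition blindly.
guesses = ["admin", "test-user", "person-0001"]
print([is_trigger(g) for g in guesses])  # [False, False, False]
print(is_trigger("person-1234"))         # True
```

A neural network makes the analyst's job harder still, because the condition is spread across many numeric weights rather than stored as a single value that can be looked up.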

Turning AI against itself

Another growing trend in AI-based threats is the adversarial attack, in which malicious actors manipulate input data to force neural networks to behave erratically. There are already several published reports and studies that show how these attacks can work in different scenarios.
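The idea of manipulating input data can be shown on a very small scale. The sketch below is not taken from any of the studies mentioned in this article. It assumes a hypothetical four-feature logistic classifier with made-up weights, and nudges each feature by a small, bounded amount in the direction that most increases the model's error (a gradient-sign perturbation).

```python
import math

# Hypothetical toy classifier: fixed logistic-regression weights (illustrative only).
WEIGHTS = [2.0, -1.5, 0.5, 1.0]
BIAS = -0.2


def predict_proba(x):
    """Probability that input x belongs to class 1."""
    z = sum(w * xi for w, xi in zip(WEIGHTS, x)) + BIAS
    return 1.0 / (1.0 + math.exp(-z))


def sign(v):
    """Return -1, 0, or 1 depending on the sign of v."""
    return (v > 0) - (v < 0)


def adversarial_example(x, true_label, epsilon):
    """Move each feature by at most epsilon in the loss-increasing direction."""
    p = predict_proba(x)
    # For logistic regression, d(cross-entropy)/dx_i = (p - y) * w_i.
    grad = [(p - true_label) * w for w in WEIGHTS]
    return [xi + epsilon * sign(g) for xi, g in zip(x, grad)]


x = [0.3, 0.1, 0.2, 0.1]
x_adv = adversarial_example(x, true_label=1, epsilon=0.2)

print(predict_proba(x) > 0.5)      # original input is classified as class 1
print(predict_proba(x_adv) > 0.5)  # perturbed input is no longer class 1
```

No feature moves by more than 0.2, yet the predicted class flips. Real attacks on image classifiers work against far larger models, but the underlying principle is similar.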

Most of the work done in the field is focused on tricking computer vision algorithms, the branch of AI that enables computers to classify and detect objects in images and video. This is the technology used in self-driving cars, facial recognition, and smart camera applications such as Google Lens.

But the problem is, we don’t exactly know how the neural networks behind computer vision algorithms define the characteristics of each object, and that’s why they can fail in epic and unexpected ways.

For instance, students at MIT showed that by making minor tweaks to a toy turtle, they could trick computer vision algorithms into classifying it as a rifle. In a similar study, researchers at Carnegie Mellon University showed they could fool facial recognition algorithms into mistaking them for celebrities by donning special glasses. In yet another case, software used by the UK’s Metropolitan Police to detect child pornography flagged pictures of dunes as nudes.

While these are relatively harmless examples, things can get problematic as neural networks find their way into an increasing number of critical settings. For instance, a joint study by researchers at the University of Michigan, the University of Washington, and the University of California, Berkeley, found that by sticking small black and white stickers on stop signs, they could make them undetectable to the AI algorithms used in self-driving cars.

Human vision isn’t perfect, and our eyes often fail us. But in none of these cases would a human make the same mistake the AI does. All these studies underline one very important fact: Although computer vision algorithms often perform on par with or better than humans at detecting objects, they work in a significantly different way from human vision, and we can’t predict their failures until they happen.

Because of the opaque nature of neural networks, it’s extremely difficult to investigate their vulnerabilities. If malicious actors discover them first, either by chance or by trial and error, they’ll have an easier time hiding and exploiting them to force AI applications to fail in critical and harmful ways. This intentional manipulation of AI algorithms is known as an adversarial attack.

Adversarial attacks aren’t limited to computer vision algorithms. For instance, researchers have found that hackers could manipulate audio files in a way that would be undetectable to the human ear but would silently send commands to a voice-enabled device, such as a smart speaker.

Unlocking the black box

To be clear, AI cyberattacks haven’t been commoditized yet. Developing AI malware and adversarial attacks is still very difficult, and they don’t work consistently. But it’s only a matter of time before someone develops the tools that make them widely available. We just need to look at how FakeApp, an application that simplified face-swapping with the use of deep learning, triggered a wave of fake porn videos and rising concerns about the threats of AI-based forgery and fraud.

There are several proposed defenses against adversarial attacks. But even researchers admit that none of the solutions are complete, because they mostly try to address the neural network black box by poking at it from different angles to trigger any unpleasant surprise it might hold.
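"Poking at it from different angles" can be as simple as feeding a model many slightly altered copies of an input and recording which ones change its answer. The sketch below assumes a hypothetical stand-in classifier with made-up weights; the function names are illustrative and not drawn from any published defense.

```python
import random


def classify(x):
    """Hypothetical stand-in model: a linear score thresholded at zero."""
    weights = [2.0, -1.5, 0.5, 1.0]
    score = sum(w * xi for w, xi in zip(weights, x)) - 0.2
    return 1 if score > 0 else 0


def probe(x, epsilon, trials, seed=0):
    """Return perturbed inputs within epsilon of x that change the predicted label."""
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    baseline = classify(x)
    failures = []
    for _ in range(trials):
        candidate = [xi + rng.uniform(-epsilon, epsilon) for xi in x]
        if classify(candidate) != baseline:
            failures.append(candidate)
    return failures


# An input far from the decision boundary survives every probe...
print(len(probe([1.0, 0.0, 0.0, 0.0], epsilon=0.1, trials=1000)) == 0)
# ...while one near the boundary is flipped by some of the random perturbations.
print(len(probe([0.3, 0.1, 0.2, 0.1], epsilon=0.2, trials=1000)) > 0)
```

The sketch also hints at why such defenses are incomplete: random probing only finds the failures it happens to stumble on, so a clean result does not prove that no harmful perturbation exists.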

Meanwhile, AI-based malware has no documented solution yet. According to the researchers who first raised awareness about it, we can’t even know whether there is already AI-powered malware in the wild or not.

A very important part of securing AI is making it explainable and transparent. This means that neural networks should either be able to explain the steps they take to reach a decision or allow researchers to reverse engineer and retrace those steps.
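For very simple models, retracing the steps behind a decision is straightforward, which makes for a useful contrast. The sketch below assumes a hypothetical linear model with invented feature names and weights. Its decision splits cleanly into one contribution per feature, something a deep neural network does not offer directly.

```python
# Hypothetical linear model; feature names and weights are illustrative only.
FEATURES = ["feature_a", "feature_b", "feature_c", "feature_d"]
WEIGHTS = [2.0, -1.5, 0.5, 1.0]
BIAS = -0.2


def explain(x):
    """Return the decision and each feature's contribution, largest effect first."""
    contributions = [(name, w * xi) for name, w, xi in zip(FEATURES, WEIGHTS, x)]
    score = sum(c for _, c in contributions) + BIAS
    decision = 1 if score > 0 else 0
    # Rank features by how strongly they pushed the score in either direction.
    ranked = sorted(contributions, key=lambda item: abs(item[1]), reverse=True)
    return decision, ranked


decision, ranked = explain([0.3, 0.1, 0.2, 0.1])
print(decision)
for name, contribution in ranked:
    print(f"{name}: {contribution:+.2f}")
```

Here the per-feature contributions and the bias add up exactly to the score behind the decision. The explainability problem exists because the layered, nonlinear computations of neural networks do not decompose this neatly.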

Creating explainable AI without compromising the performance of neural networks is difficult, but there are already several efforts in the works, including a government-funded project led by DARPA, the Defense Department’s research arm.

Regulations such as the European Union’s GDPR and California’s CCPA require tech companies to be transparent about their data collection and processing practices and be able to explain the automated decisions their applications make. Compliance with these rules should also help immensely toward achieving explainable AI. And if tech companies are worried that they will lose their competitive edge by making the inner workings of their proprietary algorithms understandable, they should consider that nothing will be more damaging to their reputation—and by extension their bottom line—than a security disaster attributed to their AI algorithms.

It took a global election-manipulation crisis in 2016 for the world to wake up to the destructive power of algorithmic manipulation. Since then, AI algorithms have become even more prominent in everything we do. We shouldn’t wait for another security disaster to happen before we decide to address the AI black box problem.