As covid-19 disrupted the world in March, online retail giant Amazon struggled to respond to the sudden shift caused by the pandemic. Household items like bottled water and toilet paper, which had never run out of stock, were suddenly in short supply. One- and two-day deliveries were delayed for several days. Though Amazon CEO Jeff Bezos would go on to make $24 billion during the pandemic, the company initially struggled to adjust its logistics, transportation, supply chain, purchasing, and third-party seller processes to prioritize stocking and delivering higher-priority items.
Under normal circumstances, Amazon’s complicated logistics are mostly handled by artificial intelligence algorithms. Honed on billions of sales and deliveries, these systems accurately predict how much of each item will be sold, when to replenish stock at fulfillment centers, and how to bundle deliveries to minimize travel distances. But as the coronavirus crisis has changed our daily habits and life patterns, those predictions are no longer valid.
“In the CPG [consumer packaged goods] industry, the consumer buying patterns during this pandemic has shifted immensely,” Rajeev Sharma, SVP and global head of enterprise AI solutions & cognitive engineering at AI consultancy firm Pactera Edge, told Gizmodo. “There is a tendency of panic buying of items in larger quantities and of different sizes and quantities. The [AI] models may have never seen such spikes in the past and hence would give less accurate outputs.”
Artificial intelligence algorithms are behind many changes to our daily lives in the past decades. They keep spam out of our inboxes and violent content off social media, with mixed results. They fight fraud and money laundering in banks. They help investors make trade decisions and, terrifyingly, assist recruiters in reviewing job applications. And they do all of this millions of times per day, with high efficiency—most of the time. But they are prone to becoming unreliable when rare events like the covid-19 pandemic happen.
Among the many things the coronavirus outbreak has highlighted is how fragile our AI systems are. And as automation continues to become a bigger part of everything we do, we need new approaches to ensure our AI systems remain robust in the face of black swan events that cause widespread disruptions.
Why AI algorithms fail
Key to the commercial success of AI are advances in machine learning, a category of algorithms that develop their behavior by finding and exploiting patterns in very large sets of data. Machine learning and its more popular subset, deep learning, have been around for decades, but their use had previously been limited due to their intensive data and computational requirements. In the past decade, the abundance of data and advances in processor technology have enabled companies to use machine learning algorithms in new domains such as computer vision, speech recognition, and natural language processing.
When trained on huge data sets, machine learning algorithms often ferret out subtle correlations between data points that would have gone unnoticed by human analysts. These patterns enable them to make forecasts and predictions that are useful most of the time for their designated purpose, even if they’re not always logical. For instance, a machine learning algorithm that predicts customer behavior might discover that people who eat out at restaurants more often are more likely to shop at a particular kind of grocery store, or maybe customers who shop online a lot are more likely to buy certain brands.
“All of those correlations between different variables of the economy are ripe for use by machine learning models, which can leverage them to make better predictions. But those correlations can be ephemeral, and highly context-dependent,” David Cox, IBM director at the MIT-IBM Watson AI Lab, told Gizmodo. “What happens when the ground conditions change, as they just did globally when covid-19 hit? Customer behavior has radically changed, and many of those old correlations no longer hold. How often you eat out no longer predicts where you’ll buy groceries, because dramatically fewer people eat out.”
As consumers change their habits, the intrinsic correlations between the myriad variables that define the behavior of a supply chain fall apart, and those old prediction models lose their relevance. This can result in depleted warehouses and delayed deliveries on a large scale, as Amazon and other companies have experienced. “If your predictions are based on these correlations, without an understanding of the underlying causes and effects that drive those correlations, your predictions will be wrong,” said Cox.
The same impact is visible in other areas, such as banking, where machine learning algorithms are tuned to detect and flag sudden changes to the spending habits of customers as possible signs of compromised accounts. According to Teradata, a provider of analytics and machine learning services, one of the companies using its platform to score high-risk transactions saw a fifteen-fold increase in mobile payments as consumers started spending more online and less in physical stores. (Teradata did not disclose the name of the company as a matter of policy.) Fraud-detection algorithms search for anomalies in customer behavior, and such sudden shifts can cause them to flag legitimate transactions as fraudulent. According to the firm, it was able to maintain the accuracy of its banking algorithms and adapt them to the sudden shifts caused by the lockdown.
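To make the false-alarm problem concrete, here is a minimal sketch of an anomaly check based on a z-score against a customer's spending history. It is an illustration only, not Teradata's method: the function name, the threshold, and all of the payment amounts are hypothetical.

```python
from statistics import mean, stdev

def flag_anomalies(history, new_amounts, z_threshold=3.0):
    """Flag amounts whose z-score against the historical baseline exceeds the threshold."""
    mu = mean(history)
    sigma = stdev(history)
    return [abs(x - mu) / sigma > z_threshold for x in new_amounts]

# Hypothetical weekly mobile-payment totals for one customer before the lockdown.
pre_lockdown = [20.0, 25.0, 22.0, 18.0, 24.0, 21.0, 23.0, 19.0]
# Hypothetical totals after the same customer moves most spending online.
post_lockdown = [310.0, 295.0, 330.0]

# Judged against the old baseline, every legitimate payment looks fraudulent.
flags = flag_anomalies(pre_lockdown, post_lockdown)

# Once the baseline is rebuilt from post-lockdown data, a similar payment passes.
flags_after_rebaseline = flag_anomalies(post_lockdown, [305.0])
```

Under these made-up numbers, the legitimate post-lockdown payments are all flagged until the baseline is refreshed, which is the kind of adaptation the firm describes.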
But the disruption was more fundamental in other areas such as computer vision systems, the algorithms used to detect objects and people in images.
“We’ve seen several changes in underlying data due to covid-19, which has had an impact on performances of individual AI models as well as end-to-end AI pipelines,” said Atif Kureishy, VP of global emerging practices, artificial intelligence and deep learning for Teradata. “As people start wearing masks due to the covid-19, we have seen performance decay as facial coverings introduce missed detections in our models.”
Teradata’s Retail Vision technology uses deep learning models trained on thousands of images to detect and localize people in the video streams of in-store cameras. With powerful and potentially ominous capabilities, the AI also analyzes the video for information such as people’s activities and emotions, and combines it with other data to provide new insights to retailers. The system’s performance is closely tied to being able to locate faces in videos, but with most people wearing masks, its accuracy has dropped dramatically.
“In general, machine and deep learning give us very accurate-yet-shallow models that are very sensitive to changes, whether it is different environmental conditions or panic-driven purchasing behavior by banking customers,” Kureishy said.
We humans can extract the underlying rules from the data we observe in the wild. We think in terms of causes and effects, and we apply our mental model of how the world works to understand and adapt to situations we haven’t seen before.
“If you see a car drive off a bridge into the water, you don’t need to have seen an accident like that before to predict how it will behave,” Cox said. “You know something (at least intuitively) about why things float, and you know things about what the car is made of and how it is put together, and you can reason that the car will probably float for a bit, but will eventually take on water and sink.”
Machine learning algorithms, on the other hand, can fill the space between the things they’ve already seen, but can’t discover the underlying rules and causal models that govern their environment. They work fine as long as the new data is not too different from the data they were trained on, but as soon as their environment undergoes a radical change, they start to break.
“Our machine learning and deep learning models tend to be great at interpolation—working with data that is similar to, but not quite the same as data we’ve seen before—but they are often terrible at extrapolation—making predictions from situations that are outside of their experience,” Cox says.
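The gap between interpolation and extrapolation can be shown with a toy model. The sketch below fits a straight line to a handful of points from a curved process, then compares the error inside and far outside the training range. The quadratic process and the chosen points are assumptions made purely for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def true_process(x):
    # Hypothetical underlying rule that the model never sees directly.
    return x ** 2

# Training data covers only the range 0 to 2.
train_x = [0.0, 0.5, 1.0, 1.5, 2.0]
train_y = [true_process(x) for x in train_x]
slope, intercept = fit_line(train_x, train_y)

def predict(x):
    return slope * x + intercept

# Inside the training range the line is a tolerable approximation.
interp_error = abs(predict(1.25) - true_process(1.25))
# Far outside the range, the same line is badly wrong.
extrap_error = abs(predict(10.0) - true_process(10.0))
```

The fitted line tracks the curve reasonably well between the points it has seen, but its error grows by orders of magnitude once the input moves well beyond them.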
The lack of causal models is an endemic problem in the machine learning community and causes errors regularly. This is what causes Teslas in self-driving mode to crash into concrete barriers and Amazon’s now-abandoned AI-powered hiring tool to penalize a job applicant for putting “women’s chess club captain” in her resume.
A stark and painful example of AI’s failure to understand context happened in March 2019, when a terrorist live-streamed the massacre of 51 people in New Zealand on Facebook. The social network’s AI algorithm that moderates violent content failed to detect the gruesome video because it was shot in first-person perspective, and the algorithms had not been trained on similar content. It was taken down manually, and the company struggled to keep it off the platform as users reposted copies of it.
Major events like the global pandemic can have a much more detrimental effect because they trigger these weaknesses in a lot of automated systems, causing all sorts of failures at the same time.
How to deal with black swan events
“It is imperative to understand that the AI/ML models trained on consumer behavior data are bound to suffer in terms of their accuracy of prediction and potency of recommendations under a black swan event like the pandemic,” said Pactera’s Sharma. “This is because the AI/ML models may have never seen that kind of shifts in the features that are used to train them. Every AI platform engineer is fully aware of this.”
This doesn’t mean that the AI models are wrong or erroneous, Sharma pointed out, but it does mean that they need to be continuously retrained on new data and scenarios. We also need to understand and address the limits of the AI systems we deploy in businesses and organizations.
Sharma described, for example, an AI that classifies credit applications as “Good Credit” or “Bad Credit” and passes on the rating to another automated system that approves or rejects applications. “If owing to some situations (like this pandemic), there is a surge in the number of applicants with poor credentials,” Sharma said, “the models may have a challenge in their ability to rate with high accuracy.”
As the world’s corporations increasingly turn to automated, AI-powered solutions for deciding the fate of their human clients, these systems can have devastating implications for those applying for credit, even when working as designed. In this case, however, the automated system would need to be explicitly adjusted to deal with the new rules, or the final decisions can be deferred to a human expert to prevent the organization from accruing high-risk clients on its books.
“Under the present circumstances of the pandemic, where model accuracy or recommendations no longer hold true, the downstream automated processes may need to be put through a speed breaker like a human-in-the-loop for added due diligence,” he said.
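A "speed breaker" of this kind might look like the sketch below: a gate that sits between the credit-rating model and the approval system and defers to a person whenever the model is unsure or conditions are suspected to have shifted. The Rating class, the route function, the drift flag, and the 0.9 confidence cutoff are all hypothetical, chosen only to illustrate the idea.

```python
from dataclasses import dataclass

@dataclass
class Rating:
    label: str         # "Good Credit" or "Bad Credit"
    confidence: float  # model's confidence in the label, from 0.0 to 1.0

def route(rating, drift_suspected, min_confidence=0.9):
    """Decide whether a rating can be acted on automatically."""
    # Defer to a human when the data may have shifted or the model is unsure.
    if drift_suspected or rating.confidence < min_confidence:
        return "human_review"
    return "approve" if rating.label == "Good Credit" else "reject"

normal_case = route(Rating("Good Credit", 0.97), drift_suspected=False)
pandemic_case = route(Rating("Good Credit", 0.97), drift_suspected=True)
unsure_case = route(Rating("Bad Credit", 0.60), drift_suspected=False)
```

In this sketch, the same high-confidence rating that would be approved automatically in normal times is sent to a reviewer once drift is suspected.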
IBM’s Cox believes if we manage to integrate our own understanding of the world into AI systems, they will be able to handle black swan events like the covid-19 outbreak.
“We must build systems that actually model the causal structure of the world, so that they are able to cope with a rapidly changing world and solve problems in more flexible ways,” he said.
The MIT-IBM Watson AI Lab, where Cox works, has been working on “neurosymbolic” systems that bring together deep learning with classic, symbolic AI techniques. In symbolic AI, human programmers explicitly specify the rules and details of the system’s behavior instead of training it on data. Symbolic AI was dominant before the rise of deep learning and is better suited for environments where the rules are clear-cut. On the other hand, it lacks the ability of deep learning systems to deal with unstructured data such as images and text documents.
The combination of symbolic AI and machine learning has helped create “systems that can learn from the world, but also use logic and reasoning to solve problems,” Cox said.
IBM’s neurosymbolic AI is still in the research and experimentation stage. The company is testing it in several domains, including banking.
Teradata’s Kureishy pointed to another problem that is plaguing the AI community: labeled data. Most machine learning systems are supervised, which means before they can perform their functions, they need to be trained on huge amounts of data annotated by humans. As conditions change, the machine learning models need new labeled data to adjust themselves to new situations.
Kureishy suggested that the use of “active learning” can, to a degree, help address the problem. In active learning models, human operators constantly monitor the performance of machine learning algorithms and provide them with new labeled data in areas where their performance starts to degrade. “These active learning activities require both human-in-the-loop and alarms for human intervention to choose what data needs to be relabeled, based on quality constraints,” Kureishy said.
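One common way to decide what gets relabeled is uncertainty sampling, where people are asked to label the items the model is least sure about. The sketch below pairs that with a simple accuracy alarm. It is a generic illustration rather than Teradata's pipeline, and the function names, scores, labeling budget, and 0.9 accuracy floor are assumptions.

```python
def select_for_labeling(probabilities, budget):
    """Return indices of the `budget` items whose predicted probability is closest to 0.5."""
    ranked = sorted(range(len(probabilities)),
                    key=lambda i: abs(probabilities[i] - 0.5))
    return ranked[:budget]

def needs_intervention(recent_correct, min_accuracy=0.9):
    """Raise the alarm when accuracy on recently checked predictions falls below the floor."""
    accuracy = sum(recent_correct) / len(recent_correct)
    return accuracy < min_accuracy

# Hypothetical model scores for six new items (probability of the positive class).
scores = [0.98, 0.52, 0.10, 0.45, 0.80, 0.61]
to_label = select_for_labeling(scores, budget=2)

# Hypothetical spot check: 7 of the last 10 predictions were correct.
alarm = needs_intervention([True] * 7 + [False] * 3)
```

Here the two items scored closest to 0.5 are sent to human annotators, and the alarm fires because the spot-check accuracy of 70 percent is below the assumed quality floor.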
But as automated systems continue to expand, human efforts fail to meet the growing demand for labeled data. The rise of data-hungry deep learning systems has given birth to a multibillion-dollar data-labeling industry, often powered by digital sweatshops with underpaid workers in poor countries. And the industry still struggles to create enough annotated data to keep machine learning models up to date. We will need deep learning systems that can learn from new data with little or no help from humans.
“As supervised learning models are more common in the enterprise, they need to be data-efficient so that they can adapt much faster to changing behaviors,” Kureishy said. “If we keep relying on humans to provide labeled data, AI adaptation to novel situations will always be bounded by how fast humans can provide those labels.”
Developing deep learning models that need little or no manually labeled data is an active area of AI research. At last year’s AAAI Conference, deep learning pioneer Yann LeCun discussed progress in “self-supervised learning,” a type of deep learning algorithm that, like a child, can explore the world by itself without being specifically instructed on every single detail.
“I think self-supervised learning is the future. This is what’s going to allow our AI systems to go to the next level, perhaps learn enough background knowledge about the world by observation, so that some sort of common sense may emerge,” LeCun said in his speech at the conference.
But as is the norm in the AI industry, it takes years—if not decades—before such efforts become commercially viable products. In the meantime, we need to acknowledge and embrace the power and limits of current AI.
“These are not your static IT systems,” Sharma says. “Enterprise AI solutions are never done. They need constant re-training. They are living, breathing engines sitting in the infrastructure. It would be wrong to assume that you build an AI platform and walk away.”
Ben Dickson is a software engineer, tech analyst, and the founder of TechTalks.