The Future Is Here
We may earn a commission from links on this page

DeepMind's New AI Uses Game Theory to Trounce Humans in 'Stratego'

The "DeepNash" AI had a 97% win rate against other models and an 84% win rate against top human players.

Screenshot: DeepMind

Humans are quickly running out of board games we can still play without being utterly clobbered by AI. In the past, researchers demonstrated AI’s ability to best humans at chess, Go, and, recently, Diplomacy. Now, you can add the strategy game Stratego to that ever-growing list.

Researchers from Alphabet-owned DeepMind, according to new research shared with Gizmodo, say they’ve created a new AI agent capable of playing Stratego at a “human expert level.” The AI, called DeepNash, won nearly all of the matches it played against other AIs and had an 84% overall win rate when competing against human players in online games. DeepNash, which learned to master the game by playing against itself, was able to make complex decisions and consider tradeoffs in “extraordinary” ways previous AI systems couldn’t.


While Stratego may not initially seem like the most obvious choice for training an AI, the researchers say the game’s combination of longer-term decision-making and imperfect information makes it a unique test bed. The game is typically played by two players and involves both strategy and deception. Players each have their own “armies” made up of pieces, each with its own value. Players win by either capturing an opponent’s flag or capturing all of their movable pieces.

All of those pieces with their different values result in an extremely large number of possible moves and outcomes. The researchers said Stratego has far more “possible states” than Texas Hold ’em poker, and even more than Go, which is often heralded for its immense variety of possible choices.


To win, DeepNash mixed long-term strategy with short-term decision-making like bluffing and taking chances. It’s rare for an AI agent to do both so well at the same time. Stratego’s combination of long-range strategic thinking and decisions based on incomplete or limited information has mostly thwarted past AI models.

“DeepNash was able to make nontrivial trade-offs between information and material, to execute bluffs, and to take gambles when needed,” the researchers write.

DeepNash appears to draw influence from American mathematician John Nash who, among other things, introduced the concept now known as the Nash equilibrium. In a nutshell, that equilibrium refers to a solution in game theory in which neither opponent can do better by changing strategy on their own while the other side’s strategy stays the same. Of the many possible scenarios, the Nash equilibrium is often considered the “optimal” outcome.
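To make that idea concrete, here is a minimal Python sketch that checks whether a pair of strategies is a Nash equilibrium in rock-paper-scissors, a far smaller game than Stratego. It is only an illustration of the definition, not DeepMind's method, and the names `RPS`, `expected_payoff`, and `is_nash_equilibrium` are hypothetical ones chosen for this example.

```python
from fractions import Fraction

# Row player's payoffs in rock-paper-scissors (illustrative toy game).
# Rows and columns are ordered rock, paper, scissors; the column
# player's payoff is the negative of each entry (zero-sum).
RPS = [
    [0, -1, 1],
    [1, 0, -1],
    [-1, 1, 0],
]


def expected_payoff(payoff, row_mix, col_mix):
    """Row player's average payoff when both sides play mixed strategies."""
    return sum(
        row_mix[i] * payoff[i][j] * col_mix[j]
        for i in range(len(row_mix))
        for j in range(len(col_mix))
    )


def is_nash_equilibrium(payoff, row_mix, col_mix):
    """True if neither player gains by switching to any single pure move."""
    value = expected_payoff(payoff, row_mix, col_mix)
    n_rows, n_cols = len(payoff), len(payoff[0])
    # Best the row player could do by deviating to one fixed move.
    best_row = max(
        sum(payoff[i][j] * col_mix[j] for j in range(n_cols))
        for i in range(n_rows)
    )
    # Best the column player could do, which means the lowest row payoff.
    best_col = min(
        sum(row_mix[i] * payoff[i][j] for i in range(n_rows))
        for j in range(n_cols)
    )
    return best_row <= value and best_col >= value


uniform = [Fraction(1, 3)] * 3   # play each move a third of the time
always_rock = [1, 0, 0]          # a predictable strategy

print(is_nash_equilibrium(RPS, uniform, uniform))      # True
print(is_nash_equilibrium(RPS, always_rock, uniform))  # False
```

Playing each move a third of the time passes the check because no single switch helps either side. Always playing rock fails it, since the opponent would gain by switching to paper. Stratego is far too large to check this way, which is why an approach based on learning is needed.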

DeepNash at its core attempts to locate the Nash equilibrium in Stratego games using a new algorithm called “R-NaD,” short for Regularized Nash Dynamics, which combines self-play with model-free reinforcement learning. By pairing that algorithm with a deep neural network architecture, the researchers were able to create a winning bot, even in exceedingly complex situations. Though DeepNash was trained to compete in Stratego, DeepMind appears to have created a game theory genius.


Researchers tested DeepNash by pitting it against other bots and against “top human players” on the online gaming platform Gravon. DeepNash achieved a minimum win rate of 97% against the bots. Its performance against humans was only slightly worse, with an overall win rate of 84%. The AI ranked among the top three players on both the year-to-date and all-time leaderboards.

“To the best of our knowledge, this is the first time an AI algorithm was able to learn to play Stratego at a human expert level,” the researchers said.