Two artificial intelligence (AI) programs have finally proven they “know when to hold ’em, and when to fold
’em,” recently beating human professional card players for the first time at the popular poker game of Texas
Hold’em. And this week the team behind one of those AIs, known as DeepStack, has divulged some of the
secrets to its success, a triumph that could one day lead to AIs that perform tasks ranging from beefing
up airline security to simplifying business negotiations.
AIs have long dominated games such as chess, and last year one conquered Go, but they have made relatively
lousy poker players. In DeepStack researchers have broken their poker losing streak by combining new
algorithms and deep machine learning, a form of computer science that in some ways mimics the human brain,
allowing machines to teach themselves.
“It’s a … a scalable approach to dealing with [complex information] that could quickly make a very good
decision even better than people,” says Murray Campbell, a senior researcher at IBM in Armonk, New York,
and one of the creators of the chess-besting AI, Deep Blue.
Chess and Go have one important thing in common that let AIs beat them first: They’re perfect information
games. That means both sides know exactly what the other is working with—a huge assist when designing an
AI player. Texas Hold’em is a different animal. In this version of poker, two or more players are randomly dealt
two face-down cards. At the introduction of each new set of public cards, players are asked to bet, hold, or
abandon the money at stake on the table. Because of the random nature of the game and two initial private
cards, players’ bets are predicated on guessing what their opponent might do. Unlike chess, where a winning
strategy can be deduced from the state of the board and all the opponent’s potential moves, Hold’em
requires what we commonly call intuition.
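
To give a rough sense of how much is hidden, the short sketch below counts the two-card hands an opponent could be holding at different points in a hand. It assumes a standard 52-card deck and the usual Hold’em deal of three, then one, then one public cards; those details and the function name `possible_opponent_hands` are illustrative and not taken from the DeepStack work.

```python
from math import comb

DECK_SIZE = 52  # assumption: a standard deck with no jokers


def possible_opponent_hands(known_cards: int) -> int:
    """Count the two-card hands an opponent could hold, given how many cards we can see."""
    unseen = DECK_SIZE - known_cards
    return comb(unseen, 2)


# Before any public cards, we only know our own two cards.
print(possible_opponent_hands(2))  # 1225
# After three public cards are revealed, five cards are known.
print(possible_opponent_hands(5))  # 1081
# After all five public cards are revealed, seven cards are known.
print(possible_opponent_hands(7))  # 990
```

Even at the end of a hand, close to a thousand private holdings remain possible, which is why a player has to reason about what an opponent is likely to have rather than read it off the table.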
The aim of traditional game-playing AIs is to calculate the possible results of a game as far as possible and then
rank the strategy options using a formula that searches data from other winning games. The downside to this
method is that in order to compress the available data, algorithms sometimes group together strategies that
don’t actually work, says Michael Bowling, a computer scientist at the University of Alberta in Edmonton, Canada.
His team’s poker AI, DeepStack, avoids abstracting data by only calculating ahead a few steps rather than an
entire game. The program continuously recalculates its algorithms as new information is acquired. When the AI
needs to act before the opponent makes a bet or holds and does not receive new information, deep learning
steps in. Neural networks, the systems that enact the knowledge acquired by deep learning, can help limit the
potential situations factored by the algorithms because they have been trained on the behavior in the game.
This makes the AI’s reaction both faster and more accurate, Bowling says. In order to train DeepStack’s
neural networks, researchers required the program to solve more than 10 million randomly generated poker
situations.
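
The general idea of looking only a few steps ahead and then falling back on a learned estimate can be shown with a toy example. The sketch below is not DeepStack’s algorithm: it uses a simple perfect-information stone-taking game (take one to three stones, whoever takes the last stone wins), so it ignores hidden cards entirely. The function `estimate_value` is a hypothetical hand-written stand-in for a trained neural network.

```python
from typing import Callable

MOVES = (1, 2, 3)  # a player may take one, two, or three stones


def estimate_value(pile: int) -> float:
    """Hypothetical stand-in for a trained network: value of `pile` for the player to move."""
    return -1.0 if pile % 4 == 0 else 1.0


def lookahead(pile: int, depth: int, evaluate: Callable[[int], float]) -> float:
    """Value of `pile` for the player to move, searching at most `depth` moves ahead."""
    if pile == 0:
        return -1.0  # the previous player took the last stone, so the player to move lost
    if depth == 0:
        return evaluate(pile)  # stop searching and trust the estimate instead
    return max(-lookahead(pile - take, depth - 1, evaluate)
               for take in MOVES if take <= pile)


def best_move(pile: int, depth: int = 2,
              evaluate: Callable[[int], float] = estimate_value) -> int:
    """Pick the number of stones to take using only a shallow search."""
    legal = [take for take in MOVES if take <= pile]
    return max(legal, key=lambda take: -lookahead(pile - take, depth - 1, evaluate))


print(best_move(10))  # 2
print(best_move(5))   # 1
```

The search never plays the game out to the end; the quality of its choices depends on how good the estimate at the search horizon is, which is the role the article attributes to DeepStack’s neural networks.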
To test DeepStack, the researchers pitted it last year against a pool of 33 professional poker players selected by
the International Federation of Poker. Over the course of 4 weeks, the players challenged the program to
44,852 games of heads-up no-limit Texas Hold’em, a two-player version of the game in which participants
can bet as much money as they have. After using a formula to eliminate instances where luck, not strategy,
caused a win, researchers found that DeepStack’s final win rate was 486 milli-big-blinds per game. A
milli-big-blind is one-thousandth of the bet required to win a game. That’s nearly 10 times what
professional poker players consider a sizable margin, the team reports this week in Science.
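
The arithmetic behind those figures can be checked with a few lines of code. The sketch below converts the reported rate into whole big blinds over the 44,852 games. The 50 milli-big-blind threshold is an assumption inferred from the phrase “nearly 10 times” and is not stated in the article, and the total is a luck-adjusted figure rather than a count of chips actually won.

```python
MILLI = 1000  # one big blind equals 1000 milli-big-blinds

WIN_RATE_MBB_PER_GAME = 486  # reported luck-adjusted win rate
GAMES_PLAYED = 44_852        # reported number of games
ASSUMED_SIZABLE_MARGIN = 50  # assumption: inferred from "nearly 10 times"


def mbb_to_big_blinds(win_rate_mbb_per_game: float, games: int) -> float:
    """Convert a per-game rate in milli-big-blinds into total big blinds over `games`."""
    return win_rate_mbb_per_game / MILLI * games


total = mbb_to_big_blinds(WIN_RATE_MBB_PER_GAME, GAMES_PLAYED)
ratio = WIN_RATE_MBB_PER_GAME / ASSUMED_SIZABLE_MARGIN

print(f"{total:,.0f} big blinds over {GAMES_PLAYED:,} games")
print(f"{ratio:.2f} times the assumed sizable margin")
```

On these numbers the rate works out to roughly 21,798 big blinds in total, or just under half a big blind per game.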
The team’s findings coincide with the very public success several weeks ago of Libratus, a poker AI designed
by researchers at Carnegie Mellon University in Pittsburgh, Pennsylvania. In a 20-day poker competition held in
Pittsburgh, Libratus bested four of the top-ranked human Texas Hold’em players in the world over the course
of 120,000 hands. Both teams say their system’s superiority over humans is backed by statistically significant
findings. The main difference is that, because of its lack of deep learning, Libratus requires more computing
power for its algorithms and initially needs to solve to the end of the game every time to create a strategy, Bowling
says. DeepStack can run on a laptop.
Though there’s no clear consensus on which AI is the true poker champ, and no match between the two has
been arranged so far, both systems are already being adapted to solve more complex real-world
problems in areas like security and negotiations. Bowling’s team has studied how AI could more successfully
randomize ticket checks for honor-system public transit.
Researchers are also interested in the business implications of the technology. For example, an AI that can
understand imperfect information scenarios could help determine what the final sale price of a house would be
for a buyer before knowing the other bids, allowing that buyer to better plan on a mortgage. A system like
AlphaGo, the perfect information game-playing AI that defeated a Go world champion last year, couldn’t do
this because of the lack of limitations on the possible size and number of other bids.
Still, DeepStack is a few years away from truly being able to mimic complex human decision making, Bowling
says. The machine still has to learn how to more accurately handle scenarios where the rules of the game are
not known in advance, like versions of Texas Hold’em that its neural networks haven’t been trained for, he
says.
Campbell agrees. “While poker is a step more complex than perfect information games,” he says, “it’s still
a long way to go to get to the messiness of the real world.”