Artificial intelligence has won at multiplayer poker. The final frontier has been crossed!
Two years ago, Libratus beat four pros at poker. The multi-day tournament was played one-on-one, and the humans suffered a crushing defeat. It was a huge step forward for artificial intelligence, but even the co-creator of Libratus, Professor Tuomas Sandholm, did not believe that an AI could cope with more players at the same time. The scientist has just proved himself wrong.
Sandholm is the co-author of an algorithm called Pluribus, which has just beaten professional players in six-player no-limit Texas Hold'em. "I didn't think it would be possible in my lifetime," said the scientist.
So far, artificial intelligence has been doing better and better in games against people, but these were always one-on-one games or team games, two against two. AI has amazed with its achievements in checkers, chess, Go and poker. All of those matches were zero-sum games: one side won and the other lost. However, playing at a six-player table is a completely different level of difficulty. It is more like a real-life situation in which you need to make decisions without knowing the resources (cards) or the decision-making process of your opponents. This is the first serious test of AI's ability in a setting other than a duel or a contest between two teams in a zero-sum game. "This is the first time we have gone beyond this paradigm and shown that AI does well in such situations," says Pluribus co-creator Noam Brown, who works for Facebook AI Research.
Pluribus started out playing at a table with one human and five independent copies of itself. In time it reached the level where it could beat five professionals at once. Its opponents were 15 professional players who took turns at the table, each of whom had previously won at least a million dollars playing poker. 10,000 hands were played and the tournament lasted 12 days.
Although Pluribus did not beat the humans as overwhelmingly as Libratus did, its achievements surprised the experts. There was some evidence to suggest that the AI techniques used in heads-up poker should also work with three players, but it was not clear whether they could be applied to more opponents playing at the highest level. "It is really sensational news that they proved their worth in a six-player match. This is an important milestone," says Professor Michael Wellman of the University of Michigan.
Pluribus, like Libratus, learned poker by playing a large number of simulated games against itself. According to its creators, the success of the program lies in its use of "depth-limited search". This mechanism allows the AI to look a few moves ahead for all opponents and to develop the best strategy on that basis. This kind of look-ahead is used by many poker programs, but applying it to a six-player game requires a huge amount of memory to store all possible moves of all opponents and all possible bets. Libratus dealt with this problem by searching only in the last two betting rounds. However, it still required 100 processors for a two-player game.
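The general idea of looking only a few moves ahead can be illustrated with a short Python sketch. This is not the Pluribus algorithm, which has to deal with hidden cards and several opponents; it is a simplified, hypothetical example for a two-player zero-sum game with full information. The tree layout, the "guess" field and the function name are all assumptions made up for this illustration.

```python
def depth_limited_value(node, depth, estimate):
    """Value of `node` for the player to move, looking `depth` moves ahead."""
    children = node.get("children", [])
    if not children:
        return node["value"]      # real end of the game: exact payoff
    if depth == 0:
        return estimate(node)     # search cut off: fall back on an estimate
    # Zero-sum game: what is good for the opponent is bad for us, so negate.
    return max(-depth_limited_value(c, depth - 1, estimate) for c in children)


def rough_guess(node):
    """Hypothetical estimate stored in the node (an assumption of this toy)."""
    return node.get("guess", 0.0)


# Toy game tree: every value is from the point of view of the player to move.
tree = {
    "children": [
        {"guess": 0.5, "children": [{"value": 3}, {"value": -2}]},
        {"guess": -0.5, "children": [{"value": 1}, {"value": 4}]},
    ]
}

full = depth_limited_value(tree, 2, rough_guess)     # searches the whole tree
shallow = depth_limited_value(tree, 1, rough_guess)  # stops after one move
print(full, shallow)
```

The shallow search is cheaper but only as good as the estimate used where it stops, which is why the quality of those estimates matters so much.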
Pluribus worked a little differently. It considered only four possible behaviors for each opponent. The first is the move calculated to be most probable, the second is one where the opponent leans toward folding, the third is one where the opponent leans toward calling, and the last is one where the opponent leans toward raising. Thanks to this it was possible to significantly reduce the computing resources required. The algorithms used were extremely efficient. Suffice it to say that during live play Pluribus ran on a machine with only two processors and 128 GB of RAM. "It is astonishing that this was achieved at all, and that it was achieved without using the computational power of GPUs and other extremely powerful hardware," says Sandholm. By comparison, the AlphaGo program that defeated Go champion Lee Sedol in 2016 used 1920 CPUs and 280 GPUs.
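The four opponent behaviors described above can be pictured with a small Python sketch. It takes a base strategy (a probability for each action) and builds three variants that lean toward folding, calling or raising. The function names, the example probabilities and the weighting factor of 5 are assumptions chosen for illustration, not the published implementation.

```python
def bias_strategy(probs, action, factor=5.0):
    """Copy of `probs` with `action` made more likely, rescaled to sum to 1.

    `factor` is an assumed weight for this sketch.
    """
    weighted = {a: p * (factor if a == action else 1.0) for a, p in probs.items()}
    total = sum(weighted.values())
    return {a: w / total for a, w in weighted.items()}


def continuation_strategies(base):
    """The four behaviors: the base strategy plus three biased variants."""
    return {
        "base": dict(base),
        "fold_biased": bias_strategy(base, "fold"),
        "call_biased": bias_strategy(base, "call"),
        "raise_biased": bias_strategy(base, "raise"),
    }


# Hypothetical base strategy for one decision point.
base = {"fold": 0.2, "call": 0.5, "raise": 0.3}
options = continuation_strategies(base)
for name, strategy in options.items():
    print(name, {a: round(p, 3) for a, p in strategy.items()})
```

Considering only a handful of variants per opponent, instead of every possible line of play, is what keeps the memory and processor requirements small.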
The experts from Carnegie Mellon University and Facebook who created Pluribus will publish only its pseudocode, that is, a description of the steps needed to create a similar program. They decided not to release the actual code so as not to make it easier to spread poker-playing software. That could destroy both the poker business and the player community.
The AI algorithm can be used wherever decisions need to be made without full knowledge of what others are doing or thinking. It will be useful in areas such as cybersecurity, trade, business negotiations or price setting. According to Sandholm, it could also help in the upcoming US presidential election by helping candidates determine the level of spending needed to win in key states. Sandholm has already founded three companies that will provide AI-based services for the business and military markets.