
AlphaGo: using machine learning to master the ancient game of Go


Demis Hassabis

The game of Go originated in China more than 2,500 years ago. Confucius wrote about the game, and it is considered one of the four essential arts required of any true Chinese scholar. More than 40 million people play it worldwide, and the rules are simple: players take turns to place black or white stones on a board, trying to capture the opponent's stones or surround empty space to make points of territory. The game is played primarily through intuition and feel, and because of its beauty, subtlety and intellectual depth it has captured the human imagination for centuries.
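The capture rule is the mechanical core of those simple rules, and it fits in a few lines of code. Here is a minimal sketch, assuming a purely illustrative board representation (a dict mapping occupied points to a colour); it says nothing about how AlphaGo represents positions. A group of connected stones is captured when it has no empty adjacent points (liberties) left.

```python
# Minimal sketch of the capture rule: a connected group of stones is
# captured when it has no liberties (empty adjacent points) left.
# The dict-based board and 19x19 size are illustrative assumptions.

BOARD_SIZE = 19

def neighbors(point):
    """The up-to-four orthogonally adjacent points on the board."""
    r, c = point
    candidates = [(r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)]
    return [(nr, nc) for nr, nc in candidates
            if 0 <= nr < BOARD_SIZE and 0 <= nc < BOARD_SIZE]

def group_and_liberties(board, point):
    """Flood-fill the connected group at `point`, collecting its liberties."""
    color = board[point]
    group, liberties, frontier = {point}, set(), [point]
    while frontier:
        for nb in neighbors(frontier.pop()):
            if nb not in board:
                liberties.add(nb)          # empty point: a liberty
            elif board[nb] == color and nb not in group:
                group.add(nb)              # same colour: extend the group
                frontier.append(nb)
    return group, liberties

def is_captured(board, point):
    """A group with zero liberties is captured and removed from the board."""
    _, liberties = group_and_liberties(board, point)
    return not liberties
```

A real engine would run this check after every move and remove any captured group; the point here is only that the rules themselves are tiny, while strong play is not.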

But as simple as the rules are, Go is a game of profound complexity. There are 1,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000 possible positions—that’s more than the number of atoms in the universe, and more than a googol times larger than chess.
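To make the googol comparison concrete, here is a back-of-the-envelope version using commonly cited estimates that do not appear in the post itself: roughly 2 × 10^170 legal Go positions versus roughly 10^47 reachable chess positions.

```latex
\[
\frac{N_{\mathrm{Go}}}{N_{\mathrm{chess}}}
  \approx \frac{2 \times 10^{170}}{10^{47}}
  = 2 \times 10^{123}
  \gg 10^{100} = 1~\text{googol}
\]
```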

This complexity is what makes Go hard for computers to play, and therefore an irresistible challenge to artificial intelligence (AI) researchers, who use games as a testing ground to invent smart, flexible algorithms that can tackle problems, sometimes in ways similar to humans. The first game mastered by a computer was noughts and crosses (also known as tic-tac-toe) in 1952. Then fell checkers in 1994. In 1997 Deep Blue famously beat Garry Kasparov at chess. It’s not limited to board games either—IBM's Watson bested two champions at Jeopardy in 2011, and in 2014 our own algorithms learned to play dozens of Atari games just from the raw pixel inputs. But to date, Go has thwarted AI researchers; computers still only play Go as well as amateurs.

Traditional AI methods—which construct a search tree over all possible positions—don’t have a chance in Go. So when we set out to crack Go, we took a different approach. We built a system, AlphaGo, that combines an advanced tree search with deep neural networks. These neural networks take a description of the Go board as an input and process it through 12 different network layers containing millions of neuron-like connections. One neural network, the “policy network,” selects the next move to play. The other neural network, the “value network,” predicts the winner of the game.
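As a rough illustration of this two-network design, both networks can be written as a deep convolutional stack over the 19×19 board with different output heads. This is a minimal sketch, not the published architecture: the 48 input feature planes, the 192-channel width, and the head shapes are assumptions for the example.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Illustrative sketch of the policy/value split described above.
# Layer counts, widths, and input planes are assumptions, not the
# published AlphaGo architecture.

BOARD = 19

class ConvTrunk(nn.Module):
    """A 12-layer convolutional stack over the board description."""
    def __init__(self, in_planes=48, width=192, depth=12):
        super().__init__()
        layers = [nn.Conv2d(in_planes, width, 5, padding=2), nn.ReLU()]
        for _ in range(depth - 1):
            layers += [nn.Conv2d(width, width, 3, padding=1), nn.ReLU()]
        self.body = nn.Sequential(*layers)

    def forward(self, board_planes):
        return self.body(board_planes)

class PolicyNetwork(nn.Module):
    """Maps a board description to a probability for each of 361 moves."""
    def __init__(self):
        super().__init__()
        self.trunk = ConvTrunk()
        self.head = nn.Conv2d(192, 1, 1)       # one logit per board point

    def forward(self, board_planes):
        logits = self.head(self.trunk(board_planes)).flatten(1)
        return F.softmax(logits, dim=1)        # distribution over moves

class ValueNetwork(nn.Module):
    """Maps a board description to a scalar prediction of the winner."""
    def __init__(self):
        super().__init__()
        self.trunk = ConvTrunk()
        self.head = nn.Linear(192 * BOARD * BOARD, 1)

    def forward(self, board_planes):
        features = self.trunk(board_planes).flatten(1)
        return torch.tanh(self.head(features))  # +1 = win, -1 = loss
```

The policy head keeps the search narrow by ranking candidate moves, while the value head compresses the same kind of features into a single win/loss estimate, which is what allows a tree search to evaluate a position without playing it out to the end.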

We trained the neural networks on 30 million moves from games played by human experts, until they could predict the human move 57 percent of the time (the previous record before AlphaGo was 44 percent). But our goal is to beat the best human players, not just mimic them. To do this, AlphaGo learned to discover new strategies for itself, by playing thousands of games between its neural networks and adjusting the connections using a trial-and-error process known as reinforcement learning. Of course, all of this requires a huge amount of computing power, so we made extensive use of Google Cloud Platform.
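The two training stages (supervised imitation of human experts, then reinforcement learning from self-play) can be sketched as follows. This is a hedged illustration rather than the published pipeline: the losses are the standard ones for imitation learning and REINFORCE-style policy gradients, `PolicyNetwork` is the sketch from above, and the data-loading helpers are placeholders.

```python
import torch

# Hedged sketch of the two training stages described above. The real
# pipeline, data, and hyperparameters are not described in this post.

policy = PolicyNetwork()                        # from the previous sketch
optimizer = torch.optim.SGD(policy.parameters(), lr=1e-3)

def supervised_step(board_planes, expert_moves):
    """Stage 1: imitate human experts by maximizing p(expert move | board)."""
    probs = policy(board_planes)                         # (N, 361)
    log_likelihood = torch.log(probs.gather(1, expert_moves.unsqueeze(1)))
    loss = -log_likelihood.mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

def reinforce_step(game_boards, game_moves, outcome):
    """Stage 2: self-play trial and error. Moves from a won game
    (outcome = +1) are made more likely; from a lost game (-1), less."""
    probs = policy(game_boards)
    log_probs = torch.log(probs.gather(1, game_moves.unsqueeze(1)))
    loss = -(outcome * log_probs).mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```

The reinforcement step is the trial-and-error adjustment the paragraph describes: the sign of the game's outcome decides whether each of its moves is reinforced or discouraged.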

After all that training it was time to put AlphaGo to the test. First, we held a tournament between AlphaGo and the other top programs at the forefront of computer Go. AlphaGo won all but one of its 500 games against these programs. So the next step was to invite the reigning three-time European Go champion Fan Hui—an elite professional player who has devoted his life to Go since the age of 12—to our London office for a challenge match. In a closed-door match last October, AlphaGo won by 5 games to 0. It was the first time a computer program had ever beaten a professional Go player. You can find out more in our paper, which was published in Nature today.

