DeepMind's AlphaGo Zero: Learns From Scratch Without Any Human Input


Demis Hassabis’ Google-backed artificial intelligence (AI) company DeepMind has developed a self-training AI program, AlphaGo Zero, that beat the previous ‘best player in the world’, another DeepMind AI known as AlphaGo Master.

Whereas previous versions learned from games played by grand masters to predict outcomes and improve their performance, AlphaGo Zero learned to play the game from scratch, with no human input.

In their paper published in Nature, the researchers describe a system built on a single neural network, with algorithms that produced rapid improvement and stable learning. Earlier versions used two neural networks: a “policy network” to select the next move and a “value network” to predict the winner of the game from each position. Combining these in AlphaGo Zero greatly improved efficiency; it needed 40x less energy than the earlier version that overcame European champion Fan Hui in 2015.
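To make the idea of one network with two outputs concrete, here is a minimal sketch in Python. It is not DeepMind's architecture: the class name `TwoHeadedNet`, the layer sizes, the random weights and the 9-cell toy board are all assumptions chosen for illustration. It only shows how a shared layer can feed both a "policy" output (a probability for each move) and a "value" output (a predicted result between -1 and 1).

```python
import math
import random


class TwoHeadedNet:
    """Toy network: one shared hidden layer feeding a policy head and a value head.

    Hypothetical illustration only; weights are random and untrained.
    """

    def __init__(self, n_inputs, n_hidden, n_moves, seed=0):
        rng = random.Random(seed)

        def matrix(rows, cols):
            return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

        self.shared = matrix(n_hidden, n_inputs)      # body used by both heads
        self.policy_head = matrix(n_moves, n_hidden)  # one score per possible move
        self.value_head = matrix(1, n_hidden)         # single score for the position

    @staticmethod
    def _matvec(m, v):
        return [sum(w * x for w, x in zip(row, v)) for row in m]

    def forward(self, board):
        hidden = [math.tanh(h) for h in self._matvec(self.shared, board)]

        # Policy head: softmax turns move scores into probabilities.
        logits = self._matvec(self.policy_head, hidden)
        peak = max(logits)
        exps = [math.exp(score - peak) for score in logits]
        total = sum(exps)
        policy = [e / total for e in exps]

        # Value head: tanh squashes the prediction into the range [-1, 1].
        value = math.tanh(self._matvec(self.value_head, hidden)[0])
        return policy, value


# A made-up 3x3 board flattened to 9 numbers, with one stone in the centre.
net = TwoHeadedNet(n_inputs=9, n_hidden=16, n_moves=9)
policy, value = net.forward([0.0] * 4 + [1.0] + [0.0] * 4)
```

Because both outputs come from the same hidden layer, the position is processed once rather than twice, which is the kind of saving the single-network design is after.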

AI learning from scratch

Demis Hassabis, CEO of DeepMind, and David Silver, research scientist and lead author, explain in their blog post: “It is able to do this by using a novel form of reinforcement learning, in which AlphaGo Zero becomes its own teacher.” They add: “The system starts off with a neural network that knows nothing about the game of Go. It then plays games against itself, by combining this neural network with a powerful search algorithm. As it plays, the neural network is tuned and updated to predict moves, as well as the eventual winner of the games.”

With each iteration the system improves, meaning the games it plays against itself get harder, which in turn improves the accuracy of the neural network, and therefore of the whole system.
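The self-play loop can be sketched on a much smaller game. The Python example below is a heavily simplified illustration, not AlphaGo Zero's method: it swaps Go for a stone-taking game (take 1 to 3 stones, whoever takes the last stone wins), swaps the neural network for a lookup table, and leaves out the search algorithm entirely. All function names and parameters are assumptions made for the sketch. What it keeps is the core idea: the program starts knowing nothing, plays both sides of every game, and nudges its estimates toward the eventual winner.

```python
import random

MAX_TAKE = 3  # a player may take 1, 2 or 3 stones; taking the last stone wins


def legal_moves(stones):
    return range(1, min(MAX_TAKE, stones) + 1)


def choose_move(table, stones, rng, explore=0.1):
    """Mostly pick the move the table rates highest; sometimes try a random one."""
    moves = list(legal_moves(stones))
    if rng.random() < explore:
        return rng.choice(moves)
    return max(moves, key=lambda m: table.get((stones, m), 0.0))


def self_play_game(table, start, rng):
    """Play one game in which both sides use the same table."""
    history = []  # (player, stones, move) for every move made
    stones, player = start, 0
    while stones > 0:
        move = choose_move(table, stones, rng)
        history.append((player, stones, move))
        stones -= move
        player = 1 - player
    winner = 1 - player  # the player who took the last stone
    return history, winner


def train(games=5000, start=10, rate=0.1, seed=0):
    """Start from an empty table and learn only from self-play results."""
    rng = random.Random(seed)
    table = {}
    for _ in range(games):
        history, winner = self_play_game(table, start, rng)
        for player, stones, move in history:
            outcome = 1.0 if player == winner else -1.0
            old = table.get((stones, move), 0.0)
            # Move the estimate a small step toward the game's actual result.
            table[(stones, move)] = old + rate * (outcome - old)
    return table


table = train()
best_first_move = choose_move(table, 10, random.Random(1), explore=0.0)
```

For this toy game the optimal strategy is known (leave the opponent a multiple of four stones), which gives a reference point for checking what the table has learned.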

In this way, AlphaGo Zero begins as a clean slate and learns from itself. Learning tabula rasa, AlphaGo Zero was able to outperform the previously published, champion-defeating version of AlphaGo, winning by 100 games to 0.

The future: self-taught AI could solve our difficult issues

The ability of artificial intelligence to teach itself could help answer some of the questions currently beyond scientific understanding. As the authors highlight, if “similar techniques can be applied to other structured problems, such as protein folding, reducing energy consumption or searching for revolutionary new materials, the resulting breakthroughs have the potential to positively impact society.”

Although gains in processing speed and computational power remain tied to Moore’s law, it is not hard to imagine a future where artificial intelligence, using systems such as those developed by DeepMind, plays a role in our everyday lives.

Reference to paper:

Silver, D., Schrittwieser, J., Simonyan, K., Antonoglou, I., Huang, A., Guez, A., . . . Hassabis, D. (2017). Mastering the game of Go without human knowledge. Nature, 550(7676), 354-359.