The grandmaster-beating AlphaGo “artificial intelligence,” developed by Google’s DeepMind division, stopped playing Go against mere humans back in May. However, that iteration of the algorithm has now been thoroughly thrashed by DeepMind’s new AlphaGo Zero.
While it sounds like some sort of soda, AlphaGo Zero may represent as much of a breakthrough as its predecessor, since it could presage the development of algorithms with skills that humans do not have.
The original AlphaGo achieved its dominance in the game of Go in two stages: first by studying the moves of human experts, and then by playing against itself, a technique known as reinforcement learning. AlphaGo Zero, meanwhile, trained itself entirely through reinforcement learning, with no human games to learn from.
And, despite starting with no tactical guidance or information beyond the rules of the game, the newer algorithm managed to beat the version of AlphaGo that defeated Lee Sedol in 2016 by 100 games to zero.
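To give a feel for what learning "from nothing but the rules" means, here is a minimal sketch of self-play reinforcement learning on a toy game. It is not DeepMind's method: AlphaGo Zero pairs a deep neural network with tree search, while this example uses a plain lookup table (tabular Q-learning) on a Nim-style game in which players alternately take one to three stones and whoever takes the last stone wins. The game, the function names and all parameter values are assumptions chosen for illustration, and each practice game starts from a random pile size so that every position gets visited.

```python
import random


def legal_moves(stones):
    """A player may take 1, 2 or 3 stones, but never more than remain."""
    return [k for k in (1, 2, 3) if k <= stones]


def train_by_self_play(max_stones=10, episodes=20000, alpha=0.5,
                       epsilon=0.2, seed=0):
    """Learn move values purely from games the program plays against itself.

    q[(stones, move)] estimates the outcome for the player about to move:
    +1 means a forced win, -1 a forced loss. Hypothetical toy setup, not
    AlphaGo Zero's actual training procedure.
    """
    rng = random.Random(seed)
    q = {(s, a): 0.0
         for s in range(1, max_stones + 1)
         for a in legal_moves(s)}

    for _ in range(episodes):
        # Random starting pile (an assumption) so all positions are explored.
        stones = rng.randint(1, max_stones)
        while stones > 0:
            moves = legal_moves(stones)
            if rng.random() < epsilon:
                move = rng.choice(moves)  # occasionally try something new
            else:
                move = max(moves, key=lambda a: q[(stones, a)])

            remaining = stones - move
            if remaining == 0:
                target = 1.0  # taking the last stone wins
            else:
                # The opponent moves next, so their best outcome is our worst.
                target = -max(q[(remaining, a)]
                              for a in legal_moves(remaining))

            # Nudge the estimate toward the observed target.
            q[(stones, move)] += alpha * (target - q[(stones, move)])
            stones = remaining  # the same table now plays the other side

    return q


def best_move(q, stones):
    """Return the move the trained table currently rates highest."""
    return max(legal_moves(stones), key=lambda a: q[(stones, a)])


if __name__ == "__main__":
    table = train_by_self_play()
    for pile in range(1, 11):
        print(pile, "stones -> take", best_move(table, pile))
```

Nobody tells the program the known winning strategy for this game (always leave your opponent a multiple of four stones). Whatever it learns comes from the results of its own games, which is the idea AlphaGo Zero applies at vastly greater scale.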
This is important because it could have major implications for the development of AIs with superhuman skills. As DeepMind researchers explained in a new Nature paper, it’s all very well to train AIs on human decisions, but the datasets derived from human experts are “often expensive, unreliable or simply unavailable.”
“Even when reliable data sets are available, they may impose a ceiling on the performance of systems trained in this manner,” the researchers wrote. “By contrast, reinforcement learning systems are trained from their own experience, in principle allowing them to exceed human capabilities, and to operate in domains where human expertise is lacking.”
What’s more, AlphaGo Zero developed its skills within just a few days, whereas the earlier AlphaGo needed months of training to reach its human-beating strength.
“We’re quite excited because we think this is now good enough to make some real progress on some real problems even though we’re obviously a long way from full AI,” DeepMind CEO Demis Hassabis told the BBC.