Google’s AlphaGo Artificial Intelligence Can Now Teach Itself, Which Will Surely End Well

On this day 30 years ago, stock markets around the world had one of their worst one-day drops ever (or their absolute worst, as was the case with the Dow Jones Industrial Average index) in what later became known as “Black Monday.” The crashes were exacerbated to a large extent by newfangled trading systems, both because they traded automatically based on certain triggers and because those same systems became overwhelmed by too much volume. Our processing power has changed significantly since then, as evidenced by an artificial intelligence breakthrough reported in Nature.

DeepMind, Google’s AI division, which previously built an AI program that taught itself how to play Breakout, has built an AI which taught itself how to play (and win) a significantly more complex game: Go, also known as Weiqi or Baduk. Their AlphaGo AI already beat the world champion Go player Lee Sedol last year, but the new version, AlphaGo Zero, is trained differently. It trained itself by competing against itself, without human examples or feedback (unlike the nunchuck robot), and it bested the earlier, world-champ-beating version of AlphaGo in 100 out of 100 games after only three days of practice. As for the processing power, AlphaGo Zero cycled through about 5 million training games in that time, far more games than Go champ Lee Sedol has played in his entire lifetime.

The new AlphaGo Zero started with just knowledge of the rules and learned from the success of a million random moves it made against itself. […] “AIs based on reinforcement learning can perform much better than those that rely on human expertise,” writes computer scientist Satinder Singh in his accompanying article. (Via)

Although Go is a complex game (more complex than chess), don’t expect this self-teaching method to be immediately applied to much more ambiguous and dangerous situations like driving a car. It does go to show how reinforcement learning can advance AI research, though, as previously evidenced by another reinforcement learning project that pitted AI programs against each other in a game of Doom. Doom, huh? That seems prescient.

(Via Axios and Nature)