Our robot overlords may have taken their next step towards global domination: researchers at ETH Zurich have reportedly created an AI robot named CyberRunner, whose task was to learn to play the popular and widely accessible labyrinth marble game better than a human.
The labyrinth is a game of physical skill in which the goal is to steer a marble from a given start point to the end point. In doing so, the player must prevent the ball from falling into any of the holes in the labyrinth board.
The movement of the ball is controlled indirectly by two knobs that change the orientation of the board. While it is a relatively straightforward game, it requires fine motor skills and spatial reasoning abilities, and, from experience, humans need a great deal of practice to become proficient at it.
CyberRunner applies recent advances in model-based reinforcement learning to the physical world. It plans its decisions and actions into the future, which allows it to make informed choices about which behaviors are likely to succeed.
Just like humans, the robot learns through experience. While playing the game, it captures observations and receives rewards based on its performance, all through the ‘eyes’ of a camera looking down at the labyrinth.
A memory is kept of the collected experience. Using this memory, the model-based reinforcement learning algorithm learns how the system behaves, and based on its understanding of the game, it recognizes which strategies and behaviors are more promising.
Consequently, the way the robot uses the two motors – its ‘hands’ – to play the game is continuously improved. Importantly, the robot does not stop playing in order to learn; instead, the algorithm runs concurrently with the robot playing the game. As a result, the robot keeps getting better, run after run.
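For readers who want a feel for the loop described above, here is a minimal Python sketch of the general idea: collect experience in a memory, learn a model of how the system behaves, and use that model to pick the more promising action. It is not CyberRunner's algorithm. The one-dimensional "marble", the goal position, the three tilt commands and every function name are assumptions made up for illustration, and the sketch interleaves learning and playing in a single loop rather than running them truly concurrently.

```python
import random

GOAL = 1.0                   # hypothetical target position for the toy marble
ACTIONS = (-1.0, 0.0, 1.0)   # hypothetical tilt commands


def true_step(x, a):
    """Toy physics, hidden from the learner: a tilt nudges the marble."""
    return x + 0.1 * a


def fit_gain(memory):
    """Least-squares estimate of k in the assumed model x_next = x + k * a."""
    num = sum(a * (x_next - x) for x, a, x_next in memory)
    den = sum(a * a for _, a, _ in memory)
    return num / den if den else 0.0


def plan(x, k):
    """Pick the action whose predicted next position is closest to the goal."""
    return min(ACTIONS, key=lambda a: abs(GOAL - (x + k * a)))


def run(steps=50, seed=0):
    rng = random.Random(seed)
    memory = []       # the collected experience
    x, k = 0.0, 0.0   # start position and an initially ignorant model
    for t in range(steps):
        # Explore randomly for a few steps, then act on the learned model.
        a = rng.choice(ACTIONS) if t < 5 else plan(x, k)
        x_next = true_step(x, a)
        memory.append((x, a, x_next))
        k = fit_gain(memory)  # learning is interleaved with play
        x = x_next
    return x, k, len(memory)


final_x, learned_k, n = run()
print(f"final position {final_x:.2f}, learned gain {learned_k:.2f}, {n} samples")
```

The real robot learns from camera images and plans much further ahead; the sketch only shows how remembered experience can turn into a model, and a model into better actions.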
The learning on the real-world labyrinth was conducted in 6.06 hours, comprising 1.2 million time steps at a control rate of 55 samples per second. The AI robot outperformed the previously fastest recorded time, achieved by an extremely skilled human player, by over six per cent.
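The quoted training figures can be checked against one another with a few lines of Python. The variable names are illustrative; the numbers are the ones given in the article.

```python
# Figures quoted in the article.
learning_hours = 6.06
control_rate_hz = 55  # samples per second

# Hours -> seconds -> control steps.
time_steps = learning_hours * 3600 * control_rate_hz
print(f"{time_steps:,.0f} time steps")
```

The result comes out just under 1.2 million, consistent with the rounded figure reported.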
During the learning process, CyberRunner also discovered shortcuts and ways to ‘cheat’ by skipping certain parts of the maze. The lead researchers, Thomas Bi and Prof. Raffaello D’Andrea, had to step in and explicitly instruct it not to take any of those shortcuts.
A preprint of the research paper is available on the project website, www.CyberRunner.ai. In addition, Bi and D’Andrea will open-source the project and make it available on the website.
D’Andrea said: “We believe that this is the ideal testbed for research in real-world machine learning and AI. Prior to CyberRunner, only organizations with large budgets and custom-made experimental infrastructure could perform research in this area. Now, for less than 200 dollars, anyone can engage in cutting-edge AI research. Furthermore, once thousands of CyberRunners are out in the real world, it will be possible to engage in large-scale experiments, where learning happens in parallel, on a global scale. The ultimate in Citizen Science.”