Who says shame can’t be an effective motivator? Less than a week after we shared Wesley Liao’s experiments using machine learning to train an AI to play QWOP, one of the hardest video games of all time, the AI was re-trained with the goal of maximizing its speed, resulting in a new world record.
Starting with an earlier AI agent named ACER, which had been trained with a focus on optimal running technique and form, Liao trained a new agent with a modified reward system. Previously, behaviors like “low torso height, vertical torso movement, and excessive knee bending” were discouraged to help ACER learn a proper stride.
But since the new AI agent was learning from ACER, which had already mastered its stride, the machine learning process focused solely on rewarding improvements to the sprinter’s forward velocity. Aside from a couple of minutes of “pre-training,” the new AI required just 40 hours of training to finally beat the best human QWOP players.
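To make the change in reward systems concrete, here is a minimal Python sketch of the two approaches described above. It is an illustration only, not Liao’s actual code: the `RunnerState` fields, the thresholds, and the penalty weight are all assumptions made up for this example, and the real reward functions may well look different.

```python
from dataclasses import dataclass


@dataclass
class RunnerState:
    """Hypothetical per-step snapshot of the sprinter (field names are assumptions)."""
    forward_velocity: float      # speed along the track
    torso_height: float          # height of the torso above the ground
    torso_vertical_speed: float  # up/down torso movement
    knee_bend: float             # knee flexion, 0.0 means a straight leg


def form_reward(state, min_torso_height=1.0, max_knee_bend=1.0, penalty_weight=0.1):
    """Earlier-style reward: forward velocity minus penalties for poor form.

    The thresholds and weight are illustrative assumptions, not published values.
    """
    penalty = 0.0
    penalty += max(0.0, min_torso_height - state.torso_height)  # low torso height
    penalty += abs(state.torso_vertical_speed)                  # vertical torso movement
    penalty += max(0.0, state.knee_bend - max_knee_bend)        # excessive knee bending
    return state.forward_velocity - penalty_weight * penalty


def speed_reward(state):
    """Speed-only reward: nothing but forward velocity is rewarded."""
    return state.forward_velocity


# A fast but sloppy step: crouched torso, bouncing, knees bent too far.
sloppy = RunnerState(forward_velocity=2.0, torso_height=0.5,
                     torso_vertical_speed=0.2, knee_bend=1.5)
print(form_reward(sloppy))   # lower than 2.0 because of the form penalties
print(speed_reward(sloppy))  # only the velocity counts
```

Under the first function a fast but ugly step scores lower than a clean one at the same speed. Under the second, the two score the same, which is only a sensible thing to optimize once the agent already knows how to run.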
The actively updated leaderboard for the QWOP 100-meter dash lives on Speedrun.com, and while the top human player (Japan’s gunmaneko) managed to get their sprinter across the finish line in 48.34 seconds, the best recorded run of Liao’s newly trained AI did it in 47.34 seconds, a full second faster. But don’t expect to see Liao’s name atop the QWOP leaderboard. Speedrunning is still a competition for human players only, and the use of software tools, such as an AI, to assist a run is strictly forbidden.
Do we need separate speedrunning leaderboards for AI players? Sure, why not? There’s good reason to keep a cautious eye on the incredible advancements we’ve made with artificial intelligence, but it’s also just plain fascinating to see how quickly an AI can be trained to best a human competitor. Despite being so challenging, QWOP is a very rudimentary video game that focuses on the precise timing of button presses. It would also be interesting to watch an AI tackle a game from The Legend of Zelda series, where interactions with other AI-powered characters come into play. In the process, an AI agent like Liao’s may even find shortcuts, techniques, or gameplay strategies that could assist human speedrunners too. In the meantime, can we at least get this QWOP-playing AI a participation trophy?