There are two unmistakable sides to the debate concerning the future of artificial intelligence. In the “boom” corner are companies like Google, Facebook, Amazon, and Microsoft, which are aggressively investing in technology to make AI systems smarter and smarter. And in the “doom” corner are prominent thinkers like Stephen Hawking and Elon Musk, the latter of whom has likened AI to “summoning the demon.”
Now, one of the most advanced AI outfits, Google’s DeepMind, is taking safety measures in case human operators need to “take control of a robot that is misbehaving [that] may lead to irreversible consequences,” which I assume includes but is not limited to killing all humans. However, the paper describing those measures doesn’t get nearly so apocalyptic and keeps its examples simple, like intelligent robots working in a factory.
The published document was a joint effort between Google’s DeepMind and Oxford University’s Future of Humanity Institute, which, as its name suggests, wants there to be a future for humanity. Founding director Nick Bostrom has for decades been vocal about the possible dangers of developing AI and has written whole books discussing the implications of super-intelligent robots.
This particular paper, titled “Safely Interruptible Agents,” investigates how to turn off an AI if it starts doing something its human operator doesn’t want it to do. The paper is filled with math that 99 percent of us will never understand, but it basically describes methods for building what it cheekily calls a “big red button” into AI.
The researchers have seen the same movies you have. You know, the ones where the robot learns to ignore a turn-off command. They’re prepared for that.
This paper explores a way to make sure a learning agent will not learn to prevent (or seek!) being interrupted by the environment or a human operator.
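For readers who want something more concrete than a movie plot, here is a minimal toy sketch in Python of what a “big red button” could look like for a simple learning agent. It is not the construction from the paper: the class name `InterruptibleQAgent`, the `safe_action` parameter, and the “work”/“stop” actions are all hypothetical, chosen only for illustration. The sketch assumes a tabular Q-learning agent, which the paper's authors describe as already safely interruptible, and shows an operator override that replaces the agent's chosen action without changing how the agent scores its options.

```python
import random
from collections import defaultdict


class InterruptibleQAgent:
    """Toy tabular Q-learning agent with an operator override (hypothetical)."""

    def __init__(self, actions, safe_action, alpha=0.5, gamma=0.9,
                 epsilon=0.1, seed=0):
        self.actions = list(actions)
        self.safe_action = safe_action  # what the agent does when interrupted
        self.alpha = alpha              # learning rate
        self.gamma = gamma              # discount factor
        self.epsilon = epsilon          # exploration probability
        self.q = defaultdict(float)     # (state, action) -> estimated value
        self.rng = random.Random(seed)

    def act(self, state, interrupted=False):
        # The operator's interruption overrides whatever the agent would pick.
        if interrupted:
            return self.safe_action
        # Otherwise explore occasionally, and act greedily the rest of the time.
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.actions)
        return max(self.actions, key=lambda a: self.q[(state, a)])

    def update(self, state, action, reward, next_state):
        # Off-policy Q-learning target: it uses the best next action according
        # to the agent's own estimates, not the action an operator may force.
        best_next = max(self.q[(next_state, a)] for a in self.actions)
        target = reward + self.gamma * best_next
        self.q[(state, action)] += self.alpha * (target - self.q[(state, action)])


# Hypothetical factory robot: the button forces "stop" regardless of the Q-table.
agent = InterruptibleQAgent(actions=["work", "stop"], safe_action="stop")
forced = agent.act("factory_floor", interrupted=True)
```

The design choice worth noticing is in `update`: because the learning target looks at the best available next action rather than the one actually taken, a forced “stop” does not feed directly into the agent's estimate of how good “work” is. That is only an intuition for why this kind of agent has little reason to learn to dodge the button; the paper's formal argument is considerably more involved.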
Such a safeguard may seem like overkill or a needless limitation, considering that the most impressive AI achievement humanity has seen so far is a machine that is really good at board games. But Bostrom has theorized before that building a human-level AI is really all it takes to quickly catapult robot brains beyond our own:
Once artificial intelligence reaches human level, there will be a positive feedback loop that will give the development a further boost. AIs would help constructing better AIs, which in turn would help building better AIs, and so forth.
Better safe than sorry.