Google's AI Safety Rules Are Way More Boring Than Asimov's

Good news, everyone: Google is looking out for us! It doesn’t want a cleaning robot to knock over grandma’s vase.

That’s one of the potentially dangerous scenarios Google engineers explored in a new paper on “concrete problems in AI safety.” For some strange reason, the researchers chose to illustrate these dangers using the example of a cleaning robot and not a sentient superintelligence that wants to enslave us.

In terms of elegance, Google’s five problems don’t hold a candle to Isaac Asimov’s Three Laws of Robotics, but I suppose that’s to be expected when you’re writing an academic paper and not a speculative science-fiction novel that doesn’t actually have to obey the laws of physics.


Here are the five problems Google thinks will be important to focus on:

  1. Avoiding negative side effects, or, “how do we make sure a cleaning robot doesn’t knock over a vase because it’s faster to do so?”
  2. Avoiding reward hacking, or, “how can we make sure the cleaning robot doesn’t just cover messes instead of cleaning?”
  3. Scalable oversight, or, “how do we make sure the cleaning robot learns quickly and doesn’t ask too often where the mop is?”
  4. Safe exploration, or, “how do we make sure the robot explores cleaning strategies but doesn’t put a mop in an electrical outlet and burn the entire house down?”
  5. Robustness to distributional shift, or, “how do we teach the robot to recognize when its skills are not useful in a different environment?”

Maybe the cleaning-robot examples make sense given that Eric Schmidt doesn’t think very highly of AI fears. (Then again, if that’s the case, why is Google working on a kill switch?)

Anyway. Google, if you figure out how to quickly teach all these things, I know some parents who might want your help, and some kids who still spend all their time trying to hide their mess instead of cleaning up. I mean, kids who have not yet been programmed to avoid “reward hacking.”


[Google via CNET]