We may earn a commission from links on this page

Google Has Made It Simple for Anyone to Tap Into Its Image Recognition AI


Google released a new AI tool on Wednesday designed to let anyone train its machine learning systems on a photo dataset of their choosing. The software is called Cloud AutoML Vision. In an accompanying blog post, the chief scientist of Google’s Cloud AI division explains how the software can help users without machine learning backgrounds harness artificial intelligence.

All hype aside, training the AI does appear to be surprisingly simple. First, you’ll need a ton of tagged images. The minimum is 20, but the software supports up to 10,000. Featuring a meteorologist in its promotional video was an apt choice by Google: not many people have thousands of tagged HD images bundled together and ready to upload. (Anime fans excluded, of course.)

A lot of image recognition is about identifying patterns. Once Google’s AI thinks it has a good understanding of what links together the images you’ve uploaded, it can be used to look for that pattern in new uploads, spitting out a number for how well it thinks the new images match it. So our meteorologist would eventually be able to upload images as the weather changes, identifying clouds while continuing to train and improve the software.
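To make the idea of "a number for how well it matches" concrete, here is a minimal toy sketch in Python. It is not Google's method and does not use the AutoML Vision API. It assumes each image has already been boiled down to a short list of numbers (the made-up feature vectors below), averages the tagged examples for each label, and scores a new image by cosine similarity against each average. The labels, values, and function names are all hypothetical.

```python
import math

def centroid(vectors):
    """Average a list of equal-length feature vectors, column by column."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def cosine(a, b):
    """Cosine similarity between two vectors (0.0 if either has zero length)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def train(tagged):
    """'Training' here just means storing one average vector per tag."""
    return {label: centroid(vecs) for label, vecs in tagged.items()}

def score(model, features):
    """Return a match score per tag for a new, untagged image."""
    return {label: cosine(center, features) for label, center in model.items()}

# Hypothetical tagged data: two cloud types, two example images each.
tagged = {
    "cumulus": [[0.9, 0.1, 0.2], [0.8, 0.2, 0.1]],
    "cirrus": [[0.1, 0.9, 0.3], [0.2, 0.8, 0.4]],
}

model = train(tagged)
scores = score(model, [0.7, 0.2, 0.2])  # features of a new upload (made up)
best = max(scores, key=scores.get)
print(best, scores)
```

A real system learns far richer features from the pixels themselves, but the shape of the workflow is the same: tagged examples go in, and each new image comes back with a score per tag.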

Being able to recognize patterns at enormous scales has immense interdisciplinary value. Oncologists have trained machine learning systems on images of breast cancer cells so they can spot the disease earlier. Neuroscientists have used algorithms on MRI scans to predict language development in children. And Stanford researchers have applied similar software to predict race and voting patterns in cities by matching census data to the frequency of specific brands of cars.


Hopefully AutoML Vision will energize more projects like these. Early detection is potentially life-saving, and this AI could also unearth new, as-yet-unproven patterns and correlations. In the hands of citizen scientists or investigative journalists, that could be transformative.

With AutoML Vision, the barrier to entry is primarily data collection: capturing and correctly tagging thousands of images for training. There are more ways to capture images than ever (via drones, cell phones, live feeds, or social media), but the means of capturing data are far from democratized. Hidden in the usual marketing speak of Google’s blog post, there’s a clear understanding that democratizing the technology could, eventually, reverberate through a number of fields.