The program discovers associations between textual and visual data, learning to match rich sets of phrases with pixels in an image. This way, it can recognize instances of specific concepts when it sees them.

LEVAN learns which terms are relevant by analyzing the content of the images found on the Web and identifying characteristic patterns across them using recognition algorithms. Currently, users can browse a library of about 175 concepts, such as "airline," "window," "beautiful," "breakfast," "shiny," "cancer," "innovation," "skateboarding," "robot," and "horse."

If the concept is not in the library, users can submit a new query through the project's website. But be prepared to wait; LEVAN is currently limited in how fast it can learn a concept on account of the tremendous computational power required to crunch each query, which can take as much as 12 hours. The researchers are looking at ways to increase LEVAN's processing speed and capabilities.

After submitting the new query, the program automatically generates an exhaustive list of subcategory images that relate to that concept. For example, a search for "dog" brings up the obvious collection of subcategories: photos of "Chihuahua dog," "black dog," "swimming dog," "scruffy dog," "greyhound dog" — but also "dog nose," "dog bowl," "sad dog," "ugliest dog," "hot dog" and even "down dog," as in the yoga pose.

Encyclopedia 2.0

The system works by searching the text of literally millions of English-language books available on Google Books. It relentlessly scours every occurrence of the concept in the entire digital library. An algorithm then filters out phrases that aren't visual (e.g., "jumping horse" would be kept, but "my horse" would not).
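
To make that filtering step concrete, here's a minimal sketch of what such a visualness filter might look like in Python. The modifier stop list, function name, and frequency threshold are all illustrative assumptions; the actual LEVAN system learns which phrases are visual from data rather than relying on a hand-coded list.

```python
import re

# Hypothetical stop list of modifiers that carry no visual information.
# The real system learns "visualness" automatically; this hard-coded
# list is just an illustrative stand-in.
NON_VISUAL_MODIFIERS = {
    "my", "your", "his", "her", "their", "our", "the", "a", "an",
    "some", "any", "this", "that", "good", "same", "other",
}

def candidate_phrases(concept, ngrams, min_count=100):
    """Return n-grams of the form '<modifier> <concept>' that look visual.

    `ngrams` maps phrases to corpus frequencies, such as one might pull
    from the Google Books text the article describes.
    """
    pattern = re.compile(rf"^(\w+) {re.escape(concept)}$")
    kept = []
    for phrase, count in ngrams.items():
        match = pattern.match(phrase.lower())
        if not match or count < min_count:
            continue
        if match.group(1) in NON_VISUAL_MODIFIERS:
            continue  # e.g. "my horse" is discarded
        kept.append(phrase)  # e.g. "jumping horse" is kept
    return kept

# Only the visually meaningful phrases survive the filter.
counts = {"jumping horse": 450, "my horse": 9000, "black horse": 1200}
print(candidate_phrases("horse", counts))  # ['jumping horse', 'black horse']
```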

After acquiring the relevant phrases, LEVAN performs an image search on the Web, hunting for uniformity in appearance among the photos gathered. Once it has trained itself on those relevant images, it can then recognize all images associated with the given phrase.
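
For a sense of how that training step might work, here's a toy sketch: given feature vectors for the images a web search returns for one phrase, it keeps the visually consistent core and fits a simple per-phrase detector. The feature choice, outlier rule, and classifier here are illustrative assumptions, not the published LEVAN pipeline.

```python
import numpy as np
from sklearn.svm import LinearSVC

def prune_outliers(features, keep_ratio=0.7):
    """Keep the images whose features sit closest to the group mean,
    a crude proxy for the "uniformity in appearance" described above."""
    center = features.mean(axis=0)
    dists = np.linalg.norm(features - center, axis=1)
    return features[dists <= np.quantile(dists, keep_ratio)]

def train_phrase_detector(positives, negatives):
    """Fit a linear classifier separating a phrase's images from background."""
    X = np.vstack([positives, negatives])
    y = np.concatenate([np.ones(len(positives)), np.zeros(len(negatives))])
    return LinearSVC().fit(X, y)

# Dummy vectors standing in for real image features (e.g. HOG descriptors).
rng = np.random.default_rng(0)
phrase_imgs = rng.normal(loc=1.0, size=(200, 64))  # web results for one phrase
background = rng.normal(loc=0.0, size=(200, 64))   # random unrelated images

detector = train_phrase_detector(prune_outliers(phrase_imgs), background)
print(detector.decision_function(phrase_imgs[:3]))  # high scores = matches
```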

"Major information resources such as dictionaries and encyclopedias are moving toward the direction of showing users visual information because it is easier to comprehend and much faster to browse through concepts," noted lead researcher Santosh Divvala in a release. "However, they have limited coverage as they are often manually curated. The new program needs no human supervision, and thus can automatically learn the visual knowledge for any concept."

To date, LEVAN has tagged more than 13 million images with 65,000 different phrases.

Looking Ahead

Given the pace of these developments, it's reasonable to assume that future systems will not only be capable of learning and teaching themselves, but will also be programmed to act on that acquired information or those new skills. This could include the design of new products or medicines, or even the refinement of their own programming, leading to the dreaded prospect of recursively self-improving artificial intelligence.

The researchers will present the project later this month at the annual Computer Vision and Pattern Recognition conference in Columbus, Ohio. The paper is titled "Learning Everything About Anything: Webly-Supervised Visual Concept Learning" (PDF). Supplemental information via the University of Washington.

Images: Person of Interest (CBS) | Divvala et al./University of Washington

Follow me on Twitter: @dvorsky