Tech. Science. Culture.
We may earn a commission from links on this page

Anti-Spam Turing Test Is Really Global Human-Powered OCR System

You know the test you have to take on Digg or Facebook, the one that proves you're a human? You see a hard-to-read word or string of gibberish, and you type in the correct characters. Carnegie Mellon researchers decided to replace randomly generated words with actual words from ancient manuscripts, words that machines are having trouble deciphering. When you or millions of other users type in a word, you are beating a machine and helping to preserve an irreplaceable text.

The original test is called the Completely Automated Public Turing Test To Tell Computers and Humans Apart, or CAPTCHA. This CMU-originated modification is called reCAPTCHA. Instead of seeing one word, you see two: one whose correct reading is already known, and one that the scanning software couldn't read. If you think about it, that's the only way the authentication could work: the known word checks that you're human, and your answer for the other word is recorded as a likely transcription. Both words are further distorted to fight spammers who may well have better OCR than the libraries.
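For the curious, the two-word scheme described above can be sketched in a few lines of Python. This is only an illustration, not reCAPTCHA's actual code: the names `Challenge`, `TranscriptionPool`, and `normalize`, and the idea of accepting a reading once a fixed number of people agree on it (`quorum`), are all assumptions made for the example.

```python
from collections import Counter
from dataclasses import dataclass, field


def normalize(text: str) -> str:
    """Compare answers case-insensitively, ignoring surrounding whitespace."""
    return text.strip().lower()


@dataclass
class Challenge:
    control_answer: str  # the word whose correct reading is already known
    unknown_id: str      # hypothetical ID of the scanned word OCR could not read


@dataclass
class TranscriptionPool:
    quorum: int = 3  # assumed number of matching answers needed to accept a reading
    votes: dict = field(default_factory=dict)  # unknown_id -> Counter of guesses

    def submit(self, challenge: Challenge, control_guess: str, unknown_guess: str) -> bool:
        """Return True if the user passes; record their guess only in that case."""
        if normalize(control_guess) != normalize(challenge.control_answer):
            # Failed the known word: treat as a bot and discard the other answer.
            return False
        counter = self.votes.setdefault(challenge.unknown_id, Counter())
        counter[normalize(unknown_guess)] += 1
        return True

    def accepted(self, unknown_id: str):
        """Return the agreed reading for a word, or None if there is no quorum yet."""
        counter = self.votes.get(unknown_id)
        if not counter:
            return None
        word, count = counter.most_common(1)[0]
        return word if count >= self.quorum else None
```

The point of the design is that a guess for the unreadable word only counts when the same person got the known word right, so the answers that pile up come from people who have already passed the human test.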

Sites like Facebook and Twitter have already started using reCAPTCHA, and right now it's processing one million words per day. That's still chump change, though. According to Luis von Ahn, a professor at CMU:

"There's no danger of us running out of words. There's still about 100 million books to be digitized, which at the current rate will take us about 400 years to complete."


[BBC News]