Crowdsourcing Could Help Deaf People Subtitle Their Everyday Life


Subtitles make TV far more accessible for deaf people, but new research promises to give people with hearing difficulties the option to subtitle their everyday lives, too, using crowdsourced transcribers.

Researchers from the University of Rochester have developed an app that lets deaf individuals read subtitles corresponding to what's happening around them in their day-to-day lives. The app, called Scribe, beams an audio track from the user's phone to a central server.

From there, the system recruits workers from Amazon's Mechanical Turk crowdsourcing service. Each worker then hears the full audio stream from the user's phone, and is asked to transcribe what they hear. To make sure the results are accurate, the software uses some neat tricks, as New Scientist reports:

All workers hear the full audio stream but the volume of different sections is raised and lowered, encouraging each person to focus on transcribing a particular part. Scribe then combines the partial transcriptions with software normally used to align evolutionarily related sequences of DNA. Bigham modified the software to account for common typos based on the layout of a keyboard - for example, if someone types "fqll", it is more likely they mean "fall" than "fill", because "a" is nearer to "q" than "i" is. The software then chooses the words that a majority of the workers have typed.
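The two merging steps quoted above can be sketched in a few lines of Python. This is an illustrative toy, not Scribe's actual code: it assumes the partial transcripts have already been aligned word-by-word (the real system does this with DNA-style sequence alignment software, omitted here), and all function names are hypothetical.

```python
# Sketch of two steps from the quoted pipeline: keyboard-distance typo
# correction and per-position majority voting. Assumes transcripts are
# already word-aligned; the real system handles alignment separately.

from collections import Counter

# Approximate physical layout of a QWERTY keyboard: each key gets a
# (row, column) coordinate, with lower rows offset about half a key.
QWERTY_ROWS = ["qwertyuiop", "asdfghjkl", "zxcvbnm"]
KEY_POS = {ch: (row, col + 0.5 * row)
           for row, keys in enumerate(QWERTY_ROWS)
           for col, ch in enumerate(keys)}

def key_distance(a, b):
    """Euclidean distance between two keys on the keyboard."""
    (r1, c1), (r2, c2) = KEY_POS[a], KEY_POS[b]
    return ((r1 - r2) ** 2 + (c1 - c2) ** 2) ** 0.5

def best_correction(typo, candidates):
    """Pick the candidate whose differing letters lie nearest the typed keys."""
    def cost(word):
        if len(word) != len(typo):
            return float("inf")  # toy model: substitution errors only
        return sum(key_distance(a, b) for a, b in zip(typo, word) if a != b)
    return min(candidates, key=cost)

def merge_transcripts(transcripts):
    """Majority vote, word by word, over pre-aligned partial transcripts."""
    merged = []
    for words in zip(*transcripts):
        winner, _ = Counter(words).most_common(1)[0]
        merged.append(winner)
    return " ".join(merged)

# The article's example: "fqll" resolves to "fall" rather than "fill",
# because 'a' sits physically closer to 'q' than 'i' does.
print(best_correction("fqll", ["fall", "fill"]))   # fall

# Three workers' aligned fragments, merged by majority vote:
print(merge_transcripts([
    "the quick brown fox".split(),
    "the quick brown fix".split(),
    "the quick brawn fox".split(),
]))   # the quick brown fox
```

In the real system the typo model and the vote interact with the alignment step, so disagreements in word count are resolved before voting; the fixed-length assumption above is purely to keep the sketch short.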


Pitted against professionals, Scribe is 74 percent accurate compared to 88.5 percent for a trained stenographer, but at a fraction of the cost. While the app is still in development, the team behind it hopes to release a beta version soon to see how it copes in the wild. [University of Rochester via New Scientist]

Image by dno1967b under Creative Commons license

This is cool, but...there's no way it's real-time. The Mechanical Turk camera took several minutes (or was it more like 10 minutes?) to process. So what would this be useful for? Many of my family members are hearing impaired, and I can't think of a use for them to record something, then 10 minutes later read what they heard. What's the point?

Now, if it were closer to real-time, that would be cool, but with the delay? Kind of useless.