<![CDATA[Gizmodo: speech recognition]]> http://gizmodo.com/tag/speechrecognition <![CDATA['Top Secret' Zumba Voice-Controlled Phone Looks Like Spy Gear, Smells Like Vapor]]> In this video, the BBC takes us inside the top secret headquarters of IA Technology, where the former ejector seat company is working on the "world's first fully accurate voice recognition phone," the Zumba.

Accompanied by a card-like carrying case/base station, the Zumba slips out into an earpiece shaped like one of those old fossilized spiral shellfish. The entire assembly sits on your head, and operates all functions, including texting, through voice commands.

Our lovely presenter can't tell us anything interesting about how this technology works (or more realistically, will one day work), other than that it is linked to some kind of cloud system, much like Google's iPhone voice app. She enthusiastically blames this on extreme secrecy, but it sounds more like second-hand PR speak to me. Also, her demonstration unit appears to be a dummy, and there's no sign that the touted speech recognition capabilities exist yet either.

The Zumba will apparently ship before next year. Or, IA Technology will explode into a massive cloud of unsent patent applications and investors' £100 notes. Either way, we have our eyes on you, Zumba. [BBC]

]]>
http://gizmodo.com/index.php?op=postcommentfeed&postId=5142628&view=rss&microfeed=true
<![CDATA[Hands-Free GPS Device for the Blind Could Make You a Superhero]]> The Navigation aid for the Blind headset is a GPS device that not only works through speech recognition but also uses obstacle detection technology to alert a blind user to any sleeping bums or other obstructions he could trip over as he is being guided to his destination.

In 2003, we reported on a GPS navigation device that led the visually impaired to their impending doom due to an "inaccuracy" of the system.

Although this new GPS device is not as cuddly as a guide dog, it is made up of just one earpiece and microphone, which would allow a blind user a certain anonymity, kind of like Daredevil: he would no longer need a cane or furry pet, which would leave both of his hands free ... to fight crime, perhaps? [create the future contest via gizmag]

]]>
http://gizmodo.com/index.php?op=postcommentfeed&postId=5060865&view=rss&microfeed=true
<![CDATA[AT&T Bringing (a Tiny, Frustrating Bit Of) Speech Recognition to the iPhone, Others]]> AT&T's Speech Mashups is a web-based service that will bring voice-activated search to the iPhone, as well as other EDGE and 3G handsets. Instead of managing speech recognition on the actual handset, Speech Mashups sends the audio sample to a server, which processes it and sends back a text transcription or command to your phone. Unfortunately for iPhone owners, this does not mean voice dialing or speech-to-text app support. Not at all.

AT&T is not currently planning to use this tech to manipulate current iPhone apps (Contacts? Maps? Mail?) but instead will deploy it in web services for a number of net-enabled handsets. This is a somewhat curious choice for AT&T, but it would be difficult to implement system-wide speech recognition without either modifying existing apps or running a (currently disallowed) background service to catch commands. Speech Mashups will be an interesting service for the other handsets it shows up on, but they already have simple voice commands. By building secondary voice capabilities like this for a phone without basic ones, AT&T has inadvertently highlighted one of Apple's most irritating restrictions on iPhone development. [Gadget Lab]

]]>
http://gizmodo.com/index.php?op=postcommentfeed&postId=5028357&view=rss&microfeed=true
<![CDATA[Direct Voxx Muso is Natural-Speech Voice Recognition Dongle for iPod nano]]> There are plenty of iPod cradles that let you remote control the device, some built into cars, but Direct Voxx has come up with the Muso that lets you do it by voice. It's an interesting bit of kit that doesn't require training to understand you, and lets you demand particular tracks, scan through playlists, pause and resume playing music just by speaking in natural language like "play California Dreaming by the Mamas and the Papas." Check out the video to see it in action.

Pretty impressive, and saves all that fiddling around with buttons when you should be busy controlling your car. It's got background noise suppression, so apparently it can cope with driving noise. And its independent battery runs it for 10 hours, without affecting the iPod.

There's just one flaw: its price. At $159 it's more than a 4GB nano itself, and that seems a little crazy. They are planning on releasing new versions for other iPods and the iPhone "as soon as possible," but this one will be out in December. [DirectVoxx]

]]>
http://gizmodo.com/index.php?op=postcommentfeed&postId=5025713&view=rss&microfeed=true
<![CDATA[ThePudding.com Phone Service Listens to Your Calls, Makes You Watch Ads]]> It sounds like a double-whammy of a bad idea: a free phone service that determines which ads to target to you by applying speech-recognition to all your conversations. To make things worse, the home page of ThePudding.com insults potential customers by saying it's "a breakthrough technology that makes your phone calls interesting." Hey, my phone calls are a thrill a minute.

Although it will offer service, ThePudding isn't trying to claim a piece of the pie that Skype, Vonage and the cable companies have been wrassling over for years. According to the AP, it hopes to "license its speech-recognition service to other companies that use Voice over Internet Protocol." But AP tech writer Peter Svensson had mixed results when testing ThePudding's speech recognition:

"Relevant ads appeared when this reporter talked about restaurants and computers, but the software was oddly insistent that he should seek a career as a social worker, showing multiple ads and links pointing to that field."
The description of the service inspires such Kubrickian paranoia, I could have just as easily used that classic image of Alex strapped to the chair, eyelids peeled back with clamps. Welcome to the future, my little droogies. AP]]>
http://gizmodo.com/index.php?op=postcommentfeed&postId=302929&view=rss&microfeed=true
<![CDATA[UPDATED: iPhone Speech Recognition Demo from VoiceSignal]]>
We've heard of VoiceSignal speech recognition for lots of other phones, but now VoiceSignal sent us a video that allegedly shows it working for the first time on the iPhone. According to the guy in the clip, a couple of VoiceSignal engineers designed this app, but all we see it doing so far is controlling music on the iPhone.

It's for real (see update below). Sure will be nice to be able to use speech commands with the iPhone, telling it to call so-and-so on those mic-equipped earbuds while keeping the phone in the pocket.

UPDATE: We got this exclusive info from Chris LeBlanc at VoiceSignal: "It works just like our other apps, so it's speaker independent and needs no training at all to recognize names, numbers and general speech. We will demo the continuous voicemode on the iPhone soon, I've already seen it — and it's nearly ready." Chris has promised us a demo copy, so we'll give you a first look as soon as it arrives. [VoiceSignal]

]]>
http://gizmodo.com/index.php?op=postcommentfeed&postId=292344&view=rss&microfeed=true
<![CDATA[Vonage Visual Voice Mail Hands On (Verdict: Mixed Success)]]> VoIP telephone service provider Vonage just began offering Visual Voice Mail, a text transcription service that turns all of your voicemail messages into text that's immediately emailed to you. Using a combination of speech-to-text software and human transcribers, Vonage is charging 25 cents per transcription, which could end up getting expensive if you have a lot of voicemails. We gave the service a try, with inconsistent results.

Our first test call was from a landscaping service that we used here at the Midwest Test Facility three years ago. See if you can decipher the meaning of this message:

"Good Afternoon this is linda for me well branded sign I'm i'm getting a hold of Truly whites residents or Company Evil goes out your neighborhood and he said that one you're properties looks like it could Use some pruning it's been three years now since we were out there And well wondering if you'd wanna a proposal from us Please give me a call At two six two two four four Nine four zero zero And let us know if that Is something you would like for us to do Thanks I'll wait for your call back up right of"
As you can see, just missing a few words can make the entire message unintelligible, turning the caller's organization into "Company Evil." Ha.

Then a second message came in, and this one fared a little better:

"Hey, it's Kim. I'm calling about lunch today. I was just calling to see if it was alright if we met at 12:30 instead of 12, 'cause I (??) have to work till 12 and I'm (??) but I can be there by 12:30, no problem. If that gonna be a problem for you, can you please give me a call back. Otherwise, I will see you at the (??)'s place at 12:30. I look forward to seeing you. I hope that's gonna be okay. Talk to you soon. Bye-bye."
Now you're talking. Except for those question marks where the software couldn't figure out what was being said, this worked out really well.

We wish the implementation were a little closer to perfect for these transcriptions, but the idea of having your voicemails delivered to you in text form is highly appealing. Imagine that in a meeting, you could quickly glance at the text of all your voicemails and immediately catch up with what's going on.

However, that $.25 price for each transcription is just not cost-effective enough. If you get a dozen voicemails a day, your monthly tab would hover around $90 for this convenience. We're thinking something more like 5 cents apiece would make it practical. We're canceling the service right away.
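
For anyone who wants to check that math, here is a minimal sketch of the arithmetic. It assumes a 30-day month, which is our guess at how the figure of around $90 comes out; the function name and parameters are made up for illustration and have nothing to do with Vonage's actual billing.

```python
def monthly_cost_cents(price_cents, messages_per_day, days_per_month=30):
    """Monthly transcription bill in cents.

    days_per_month=30 is an assumption, not a Vonage billing rule.
    Integer cents are used to avoid floating-point rounding noise.
    """
    return price_cents * messages_per_day * days_per_month


def as_dollars(cents):
    """Format an integer number of cents as a dollar string."""
    return f"${cents // 100}.{cents % 100:02d}"


# A dozen voicemails a day at the current 25-cent rate
current = monthly_cost_cents(25, 12)
# The same volume at the 5-cent rate we would rather pay
wished_for = monthly_cost_cents(5, 12)

print("At 25 cents apiece:", as_dollars(current), "per month")
print("At 5 cents apiece:", as_dollars(wished_for), "per month")
```

Under those assumptions, the same dozen messages a day would run $18 a month at 5 cents apiece, a fifth of the current tab.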

]]>
http://gizmodo.com/index.php?op=postcommentfeed&postId=282847&view=rss&microfeed=true
<![CDATA[Star Trek Tech Arrives in VoxTec Phraselator P2]]> We mentioned a Star Trek-like Phraselator translation device a few years ago here on the Giz, but now it's new and improved. The Phraselator P2 from VoxTec is a bulky-looking hand-held contraption that functions like the Tower of Babel in your hands.

Its maker says it's ruggedized, unfazed by a fumble onto pavement or a torrential rainstorm, and can translate phrases you speak in English into any language, and then translate back into English whatever people say to you. Hmm, that's a lot of languages; perhaps that's an exaggeration. Does it work, and how expensive is this thing?

Just like the latest Version 9 of Dragon NaturallySpeaking, you don't have to train it to understand your speech, and any male or female can use it right out of the box. It was developed for military use, so it must have some high technology inside, but that gives you a clue why the thing is so expensive: $2000.

Even so, it looks like it would be fun to take this baby out for a conversation or two. But then, there's no word whether it'll be able to translate GuySpeak into WomanSpeak.

Translator Gadget: Phraselator P2 by VoxTec [Product Reviews Net]

]]>
http://gizmodo.com/index.php?op=postcommentfeed&postId=260196&view=rss&microfeed=true
<![CDATA[Logitech's Harmony Remotes to get Speech Recognition, Biometrics and Search Function]]> Just when you thought they couldn't cram any more features into their Harmony 1000, the folks at Logitech are planning to give their uber remote a trio of features that'll include speech recognition, biometric security and a built-in search function.

The news came straight from Logitech's reps, who confirmed that their next Harmonies will use IBM's ViaVoice recognition software so that you'll be able to change channels by barking out commands. It gets better though.

The remotes will also rely on fingerprint readers to load customized preferences for every person in your household. Lastly, Logitech is working on a search function that'll let you load up songs from your media library by simply saying the name of the band you want to hear. So in other words, "play Sinatra" would launch your Sinatra tunes. I tend to stay away from universal remotes 'cause of their price, but something like this might be worth the splurge.

Harmony Remotes to Include Speech Recognition, Search [PC Mag via Gadget Lab]

]]>
http://gizmodo.com/index.php?op=postcommentfeed&postId=258390&view=rss&microfeed=true
<![CDATA[SpinVox Winds Your Voicemails into Text For Easy Reading]]> We had a brief chat with a SpinVox co-founder today and he told us all about this speech-to-text service. SpinVox, when integrated with a cellphone or landline provider, can take your voicemail messages and automatically transcribe them into text that gets sent to your email or your phone as a text message.

This is actually a pretty cool service, seeing as other transcription services we've seen are either expensive or strange to use. SpinVox has lined up Cincinnati Bell and Skype, and is working on some deals with major carriers now (no details yet). We know many people who don't bother listening to voicemails because it requires dialing in, pressing buttons, and listening. These are very lazy people.

Other cool SpinVox uses include sending memos and broadcast messages from your phone by calling a number and speaking. Sounds like a great way to send messages to your husband to pick up some tampons.

Product Page [Spinvox]

]]>
http://gizmodo.com/index.php?op=postcommentfeed&postId=244592&view=rss&microfeed=true
<![CDATA[Microsoft Discovers Secret to Speech Recognition]]>

Microsoft [Thanks Eric!]

]]>
http://gizmodo.com/index.php?op=postcommentfeed&postId=237176&view=rss&microfeed=true