Using Thermal Cameras to Track Hand Motions Could Be the Key to Interacting with Smart Glasses

Gif: Cornell University

If smart glasses are ever going to free us from constantly staring down at our phones, we’re going to need a reliable way to interact with a virtual screen. Thanks to new research from Cornell University and the University of Wisconsin-Madison, we could still rely on our hands and fingers without actually having to touch a screen.


It’s a creation reminiscent of the touchless gloves Tom Cruise used to interact with the futuristic police computers in Minority Report, and it’s an approach that’s been used in real-life applications and research before. We’ve all developed muscle memories for quickly navigating and tapping out messages on a smartphone, and if you’re a touch typist, you can probably easily go through the motions in mid-air without having an actual keyboard to type on.

What’s kept hand-tracking devices away from consumers is that, to date, they’ve been rather clunky contraptions, relying on strategically positioned cameras, tracking markers, or even adhesive sensors that detect electrical impulses as muscles in the user’s hands are activated. A different approach has been needed to make these devices practical for day-to-day use, and that’s what’s finally being demonstrated here.


The FingerTrak wearable trades optical cameras for just four low-resolution thermal cameras positioned around the wrist, where you’d normally wear a watch. Those cameras are pointed towards the hand but don’t have an unobstructed view of the wearer’s fingers. Instead, the four cameras continuously capture what they can see, which amounts to an ever-changing heatmap blob representing the wearer’s hand from different angles.

To the naked eye, it doesn’t make much sense. But when the images from the four thermal cameras are merged into a single silhouette, machine learning can be used to interpret the blob and extrapolate the position of every finger, in real time, with enough accuracy to distinguish between 20 finger joint positions. As nondescript as the heatmap images may appear, a given gesture, such as pointing with one finger, produces a consistent shape in the cameras’ view. That means that even if the thermal cameras aren’t perfectly aligned all the time, a machine learning model can still use the heatmap imagery to work out how the hand is posed and how it’s moving.
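To make the idea concrete, here is a toy sketch in Python of the pipeline described above: stitch four thermal frames together, reduce them to a warm/cold silhouette, and match the result to a known pose. It is not the researchers' method. Their system relies on a trained machine learning model, while this sketch swaps in simple nearest-template matching. The tiny 2x2 frames, the 30-degree skin threshold, and every function name are assumptions made up for illustration.

```python
from typing import Dict, List, Sequence, Tuple

Frame = List[List[float]]  # one low-resolution thermal image, in degrees Celsius


def make_frame(warm: Sequence[Tuple[int, int]], size: int = 2) -> Frame:
    """Build a hypothetical frame: 34 C at the listed (row, col) pixels, 22 C elsewhere."""
    return [[34.0 if (r, c) in warm else 22.0 for c in range(size)] for r in range(size)]


def stitch(frames: Sequence[Frame]) -> Frame:
    """Place equally sized frames side by side to form one wide image."""
    rows = len(frames[0])
    return [[px for f in frames for px in f[r]] for r in range(rows)]


def silhouette(image: Frame, skin_threshold_c: float = 30.0) -> List[int]:
    """Flatten the image into a binary mask: 1 where a pixel is warm enough to be skin."""
    return [1 if px >= skin_threshold_c else 0 for row in image for px in row]


def hamming(a: Sequence[int], b: Sequence[int]) -> int:
    """Count the pixels where two masks disagree."""
    return sum(x != y for x, y in zip(a, b))


def classify(mask: Sequence[int], templates: Dict[str, List[int]]) -> str:
    """Return the name of the stored pose whose mask is closest to the observed one."""
    return min(templates, key=lambda name: hamming(mask, templates[name]))


# Two made-up reference poses, each seen by four wrist cameras.
bottom_row = [(1, 0), (1, 1)]
fist_frames = [make_frame(bottom_row) for _ in range(4)]
point_frames = [make_frame([(0, 0)] + bottom_row)] + [make_frame(bottom_row) for _ in range(3)]

templates = {
    "fist": silhouette(stitch(fist_frames)),
    "point": silhouette(stitch(point_frames)),
}

# A pointing gesture where the fourth camera has slipped and misses one warm pixel.
slipped = point_frames[:3] + [make_frame([(1, 0)])]
noisy = silhouette(stitch(slipped))

print(classify(noisy, templates))  # point
```

Even with one camera slightly off, the observed silhouette is still closer to the stored pointing pose than to the fist, which is the intuition behind the tolerance to misalignment. A real system would need to output continuous joint positions rather than a pose label, which is where a learned model comes in.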

The FingerTrak prototype definitely still looks like a prototype, and few consumers would be willing to strap it to their arms in its current form. But eventually, the technology could be integrated into a device like a smartwatch, which consumers have readily embraced. There are also applications beyond enabling the future of wearables. The technology could yield low-cost devices capable of automatically translating sign language, making it easier for those who rely on their hands to communicate to talk to anyone. Tracking how the hands move, and how that movement changes over time, such as tremors developing, could also help spot the early signs of conditions like Parkinson’s.





I’m a little confused: what benefit comes from using thermal cameras instead of just regular cameras? Either way, the resulting images need to be fed through a machine learning system to decode them, right?