Soon We Will Be Able to Design Custom Sounds with Voice and Gesture

The first thing an architect or graphic designer will do at the start of a project is to produce some preliminary sketches — just to rough out their ideas on paper, perhaps augmented with computer-aided design software. But sound designers don’t have similar tools. A consortium of European researchers is seeking to change that by developing a suite of sketching tools for sound, based on voice and gestures.

“If you are an architect and want to sketch a house, you can simply draw it on a sketchpad,” the researchers wrote in a summary of their work. “But what do you do if you are a sound designer and want to rapidly sketch the sound of a new motorbike?” The usual tools (synthesizers, samplers, and sequencers, for instance) are complicated and require considerable training to use. They’re just not as simple, quick, and intuitive as a sketchpad.

Sound is difficult to describe in words, which is why most of us resort to a combination of gesture and vocal mimicry when, say, trying to convey to someone else that a car goes vrooom. The human voice is like a built-in sound synthesizer.

“People can recognize fairly well what a person imitates,” Guillaume Lemaitre, a researcher at Ircam in Paris, France, told Gizmodo via email. “So our dream tool would be a synthesizer that we could directly interact with, [using] our voice and gestures, just as what we do naturally when we talk to someone. Ideally, this synthesizer would understand the imitations the same way a person would do, and create sounds accordingly.”

That’s the goal of SkAT-VG (Sketching Audio Technologies with Voice and Gestures), a three-year interdisciplinary collaborative project between four partners. Ircam is responsible for aspects involving perception psychology, gesture analysis, signal processing, and machine learning. The Royal Institute of Technology (KTH) in Stockholm, Sweden, is handling the phonetics, while Iuav University of Venice, Italy, focuses on sound design and sound synthesis. And Genesis, a company based in Aix-en-Provence that conducts sound studies and develops audio technologies for sound design, is in charge of user studies and prototype integration.

The first step is gaining a better understanding of how people use mimicry and gesture to communicate different sounds. So Lemaitre and his Ircam colleagues rounded up 50 volunteers and had them listen to recorded sounds, then imitate those sounds. There were mechanical sounds (like tapping and scraping), sounds of common objects (cars, blenders, and saws), and computer sounds, like sound effects in video games. All the participants were filmed with a GoPro camera, tracked with a body-tracking Kinect, and fitted with accelerometers attached to their wrists. The researchers also captured the process on video.

Lemaitre admits that they had some misconceptions going into the study. For instance, “We initially thought that people would draw the trajectory of some acoustical features — like pitch or the intensity — with their hands in the air, like raising your hand to imitate pitch going up,” he said. But this proved not to be the case. Instead, gestures were used more for emphasis, in a metaphorical fashion stereotypically associated with Italian characters in film and television. “They seemed to be more like symbols that indicate certain overall properties of the sounds,” Lemaitre said.

Based on that, he and his colleagues concluded that gestures would not be particularly useful as a means of precisely controlling the behavior of a synthesizer in real time, as the consortium members originally thought would be possible. Vocal imitations are far more effective for that purpose. “Voice can reproduce accurately higher tempos than gestures, and is more precise than gestures when reproducing complex rhythmic patterns,” according to Lemaitre’s summary.

The next step is to build actual prototypes of the sketching tools, based on what’s been learned so far, and test how well they work in real-world conditions. Lemaitre said the consortium will hold a special event this spring in the south of France, specifically for sound designers, giving them the task of creating specific sounds with the prototype tools and evaluating the pros and cons of the prototypes.

Practical uses aside, Lemaitre thinks studies of vocal imitations and gestures might also prove beneficial for neuroscientists interested in auditory perception and cognition. Studies like the one above could improve our understanding of how sounds are encoded in memory.

Reference:

Rocchesso, D., Lemaitre, G., Susini, P., Ternström, S., & Boussard, P. (2015) “Sketching Sound with Voice and Gesture,” Interactions 22(1): 38-41.

[Via Acoustical Society of America]

Image: View Apart/Shutterstock
