The not-quite human voice that emanates from your phone or GPS or other device is, more often than not, female. It's an obvious pattern, and one that many have claimed has a simple technical explanation: Female voices are easier to understand. The only problem is that it's not true.
Last week, Annalee Newitz wrote about the creepiness of our feminized digital assistants. This led to some pretty heated discussions about why AI personalities like Siri, Cortana, and Alexa are given female voices in the first place. There are of course many cultural factors at play when it comes to our affinity for certain voices, but many people suggested that there was a scientific basis for the choice. The claim was that women's voices are innately easier to understand.
This is a myth, and there are several variations on it. Let's go through them one by one. The science of how hearing works and how speakers work is surprisingly fascinating, and not as straightforward as you might think.
There's no evidence that the fundamental frequency of speech (how high or low it sounds) plays a direct role in how intelligible it is, according to a well-cited and comprehensive study from Indiana University.
But what about people with hearing loss? As we age, our nerves dull and the tiny bones inside our ears gradually change, shrinking the range of frequencies we are able to perceive. In fact, it's the high-pitched frequencies that we usually lose first. The cumulative effects of hearing loss are most pronounced in the elderly, who often have trouble understanding women and children; but most of us – even spry twenty-somethings – suffer some degree of hearing loss. Hence the existence of the Mosquito, a "teen repellent" that emits a high-pitched buzz few people over the age of 30 can hear. Teens have also capitalized on this concept, using high-pitched noises for ringtones adults can't hear.
So to recap, there's no evidence that fundamental frequency plays a role in intelligibility. Even if it did, it stands to reason that low-pitched voices would be intelligible to a wider range of listeners than high-pitched ones.
As the astute among you have surely recognized, male and female voices differ in more than their fundamental frequency. Men and women also talk differently. One important difference, again according to the Indiana University study, is that women say their vowels more distinctly.
In speech scientist jargon, women have "larger vowel spaces." As Lori Holt, a psychologist at Carnegie Mellon University, explained to me, it means that the words in a cluster like "hot, heat, hoot" tend to sound more distinct when spoken by women.
The Indiana study found that female speech does tend to be a tiny bit more intelligible – but then, speakers with larger vowel spaces are also more intelligible, regardless of their gender. The difference in vowel space size may be responsible for the difference between male and female intelligibility.
That means that, yes, the voices of real, live women might be slightly easier to understand. But there is a lot of vocal variation within a group of women or within a group of men. If you're giving a voice to a digital assistant, it makes sense to pick one with great enunciation, whether it's male or female.
This much is true: Tiny speakers are crummy, especially when it comes to low-frequency sounds. That's why your phone sounds so tinny in speaker mode. Chris Kyriakakis, director of the USC Immersive Audio Laboratory and CTO of Audyssey, explained it in an email to Gizmodo:
[Tiny speakers] are incapable of reproducing low frequency sounds. In order to reproduce bass the speaker has to push a lot of air and the transducers are so small that this is just not possible. They are already being pushed to their limits and would be damaged if they were asked to reproduce bass content. So, the device makers limit the low frequency content that is sent to these speakers to avoid damaging them.
But do these factors matter when it comes to reproducing the human voice? As Kyriakakis went on to explain, tiny speakers actually affect female and male voices equally.
The frequency response of these tiny speakers typically starts above 500 Hz, which is well above the fundamental frequencies of both male and female voice...A typical male voice has a fundamental frequency between about 80 Hz to 150 Hz, while a typical female voice fundamental is in the range of 160 Hz to 250 Hz. Of course, human voices also have harmonics at higher frequencies and those are what we hear reproduced by the speakers. The missing low frequency content is roughly equivalent for male and female voices.
In other words, sounds below 500 Hz are missing from these tiny speakers, period. Switch from speaker to non-speaker mode on your phone, and you'll notice the voices in speaker mode sound higher. This is why, and it affects both male and female voices.
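To make the arithmetic concrete, here is a rough sketch in Python that takes the fundamental-frequency ranges Kyriakakis quotes and lists which harmonics fall on either side of a 500 Hz cutoff. It assumes an idealized hard cutoff at exactly 500 Hz (real speakers roll off gradually) and an arbitrary 4,000 Hz upper limit; the function names are made up for illustration.

```python
CUTOFF_HZ = 500.0  # figure from the quote; treated here as a hard cutoff (assumption)
MAX_HZ = 4000.0    # arbitrary upper limit chosen for this illustration (assumption)


def harmonics(f0_hz, max_hz=MAX_HZ):
    """Return the integer multiples of f0_hz up to max_hz, fundamental included."""
    count = int(max_hz // f0_hz)
    return [f0_hz * k for k in range(1, count + 1)]


def split_by_cutoff(f0_hz, cutoff_hz=CUTOFF_HZ, max_hz=MAX_HZ):
    """Split a voice's harmonics into those below the cutoff and those at or above it."""
    all_harmonics = harmonics(f0_hz, max_hz)
    lost = [h for h in all_harmonics if h < cutoff_hz]
    kept = [h for h in all_harmonics if h >= cutoff_hz]
    return lost, kept


# Endpoints of the ranges given in the quote above.
voices = [
    ("male, 80 Hz", 80.0),
    ("male, 150 Hz", 150.0),
    ("female, 160 Hz", 160.0),
    ("female, 250 Hz", 250.0),
]

for label, f0 in voices:
    lost, kept = split_by_cutoff(f0)
    print(f"{label}: fundamental lost = {f0 in lost}, "
          f"harmonics lost = {len(lost)}, kept = {len(kept)}, "
          f"lowest kept = {kept[0]:.0f} Hz")
```

In every case the fundamental itself sits below the cutoff, and what the speaker reproduces is the stack of harmonics above it. That is the sense in which a tiny speaker treats both kinds of voice the same way.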
An oft-cited reason for Siri's femaleness is the persistence of history. The first voice navigation systems to become widely used were in the cockpits of WWII fighter planes, where female voices supposedly stood out against the low rumble of engines.
More recently, though, a 1998 study at the Wright-Patterson Air Force Base in Ohio found the opposite: It's actually female voices that are less intelligible against the noise inside cockpits, though the difference was tiny and only statistically significant at the highest levels of noise.
In general, the effect of background noise on intelligibility varies. Low-pitched voices will stand out against high-pitched noise, and vice versa. John Neuhoff, psychology and neuroscience professor at the College of Wooster, elaborated in an email on why he thinks it shouldn't make a difference whether a voice is male or female.
My intuition is that most environmental noise is broad enough in its frequency spectrum that neither type of voice would have an advantage. However, there are always exceptions. I don't think the exceptions would favor one over the other. Sometimes a male voice would be better, sometimes a female.
And there you have it. In all, there is no convincing technical argument for why Siri and her digital, um, sisters must have female voices. Of course, there are plenty of social reasons why we might prefer a female digital assistant. But that's a different can of worms for a different time.
Top image: Sergey Nivens/shutterstock