Podcast #694: The Fascinating Secrets of Your Voice

Unless you’re a complete recluse, you probably use your voice many times a day, whether talking to your spouse, chatting with co-workers, or singing along to music in the car. Yet, you’ve probably never thought all that much about something that’s literally happening right under your nose.

My guest today says that once you do start thinking about your voice, it reveals fascinating secrets to who you are. His name is John Colapinto and he’s the author of This Is the Voice. John and I begin our conversation with what exactly the voice is, how the voice develops in babies, why men and women speak in lower and higher voices, and what each sex finds attractive in the voice of the other. We then discuss why people develop accents, and how these accents set boundaries as to who is in and who is out of a group. We dig into the modern phenomena of vocal fry and uptalk, and how, when you end everything in a question, it can sound like you’re a submissive supplicant. We get into how singing makes us feel super vulnerable, and why modern pop music can sound soulless when its inherent imperfections are stripped out. We end our conversation with the way our voices degrade as we age, and John’s call to own and use your voice.

If reading this in an email, click the title of the post to listen to the show.

Show Highlights

What is our voice?
Why are humans the only animals that have a voice?
When did humans start speaking? How did that change our thinking?
How does a baby go from babble to language?
Why men and women have a different octave of voice (unlike every other mammal)
Why do we have accents?
What’s with “uptalk” and “vocal fry”? Why are younger people in America speaking like this?
The fascinating ability of humans to pick up on subtle emotions
Why is it that singing makes us feel so vulnerable?
Why “perfect” music is actually disconcerting to listen to
What happens to your voice as you age

Read the Transcript

Brett McKay: Brett McKay here, and welcome to another edition of The Art of Manliness Podcast. Unless you’re a complete recluse, you probably use your voice many times a day, whether talking to your spouse, chatting with co-workers, or singing along to music in the car. Yet, you’ve probably never thought all that much about something that’s literally happening right under your nose. My guest today says that once you do start thinking about your voice, it reveals fascinating secrets to who you are. His name is John Colapinto. He’s the author of This Is the Voice. John and I begin our conversation with what exactly the voice is, how the voice develops in babies, why men and women speak in lower and higher voices, and what each sex finds attractive in the voice of the other. We then discuss why people develop accents and how these accents set boundaries as to who is in and who is out of a group. We then dig into the modern phenomena of vocal fry and uptalk and how, when you end everything in a question, you can sound like you’re a submissive supplicant. We get into how singing makes us feel super vulnerable, and why modern pop music can sound soulless when its inherent imperfections are stripped out. We end our conversation with the way our voices degrade as we age, and John’s call to own and use your voice. After the show’s over, check out our show notes at aom.is/thisisthevoice.

John Colapinto, welcome to the show!

John Colapinto: Thank you so much!

Brett McKay: So you are the author of a new book called This Is the Voice, where you explore the human voice: the physiology of it, the history of it, the culture of the human voice. What caused you to take a deep dive into the science and anthropology of our voice?

John Colapinto: Yeah, it was, in some ways, a long time coming. I think I really needed the initial spur of 20 years ago injuring my voice in a permanent way. I was actually, at the time, a staff writer at Rolling Stone magazine. Jann Wenner, the owner and editor, was putting together a rock band. I was tapped to be the singer because it got around the offices that I could at least hold a tune. But I had never done proper vocal warm-ups, I didn’t know how to rehearse. I over-sang horribly, like way too loud at rehearsals. And the gig itself was scary, 2,000 people, lots of celebrities in the audience, and I sort of overdid it there as well. And ended up with this rasp in my voice, which was eventually diagnosed by a laryngologist, as they’re called, as a vocal polyp. And it’s effectively a bleed in one of your vocal cords from overuse that becomes like a bump of scar tissue on one of your vocal cords.

So that got me thinking, all of a sudden, I couldn’t take my voice for granted, couldn’t sing anymore. But actually, 10 years later, when I did a story on a vocal surgeon who had saved Adele’s career, she had a polyp much like mine. He was the guy that said to me, “You know, this is messing you up more than you’re acknowledging. You are limiting your emotional range,” because we do emotion in voice with pitch changes up and down. And he said “You’re speaking in kind of that one register in order to sound a little smoother. You’re not projecting who you are because you’re sort of coming across as this raspy voice, like bourbon-swilling, cigarette smoker,” which I’m not anymore and haven’t been for years. So he put it in my head that even something as tiny as a little bump on one vocal cord is changing how you behave, how people perceive you, how you sound. I stopped speaking as much. I’m an extrovert, I stopped talking as much ’cause I would be extra raspy. And this really was the beginning of thinking about writing a book that really looked at the whole wide scope of what our voices are doing that we don’t really acknowledge.

Brett McKay: Alright, so this is gonna sound like a really basic question, but actually, this is really profound. What exactly is our voice? What is it?

John Colapinto: Oh, such a… No, that’s the $64,000 question. Absolutely, because we just assume, “Oh, yeah, voice! It’s right under our noses, literally.” But we don’t think about the fact that it actually is a signaling and communication system that is transmitting emotion, but also, of course, language, you have the linguistic layer. It’s telling people something about where we come from, from accent, and so on. It can even communicate sexual orientation if someone has a very strongly gay voice, as in the guys in whatever it’s called, Queer Eye for the Straight Guy, or something. But it’s not enough just to talk about what the signal is doing. When you think about how we’re doing it, you suddenly realize, “Wait, there’s no single vocal organ.” The vocal cords just create a buzzing sound.

We actually then have to sculpt that buzz into speech by moving our articulators, lips and tongue. But we’re actually powering the vocal cords with air that we’re pushing up from our lungs. And we take that for granted, but chimpanzees can’t do it like we do. We have to draw out our exhalations. We actually hold back in order to breathe out for a long time to string words together. Chimpanzees go…

Because they can just do little short bursts. So we’re actually… It’s like all these different body parts are acting to create this signal that is so complex. So what is the voice? Is it singing? Is it just talking? Is it coughing, laughing? It’s all of the above.

Brett McKay: And it’s not just physiology; it’s a cognition, it’s a mental… It’s in the mind as well.

John Colapinto: Oh, yeah, it’s… That’s exactly it. It’s all, everything that I just described is, of course, controlled by our brain. And we don’t think of it this way, but speech, talking, is a physical gesture. You’re literally moving body parts with exquisite control and precision, and hitting targets in your mouth with your tongue tip, either to say a T, or with the back of your tongue to say a ‘kuh’. The stuff that we’re doing that’s just motor control, again, is all controlled by the brain. But then, we’ve got that layer, that high executive cortical layer that’s getting us language, too. So we’re putting together words and putting thoughts into words, which we then get out through our vocal apparatus. So there is so much going on up in that brain of ours with voice.

Brett McKay: So let’s go back, kind of hash out some more of the physical aspects of voice. Humans are the only animals that can speak with a voice. Why can’t our primate relatives? What is it about their physiology that’s preventing them from having a voice?

John Colapinto: Yeah, you know, it’s fascinating, it’s a combination of, exactly as you say, their physiology. It’s literally where their larynx, which is the thing that holds our vocal cords, it’s like the voice box, and men can see theirs ’cause it’s where the Adam’s apple is, that point is literally where your vocal cords join, and then extend backwards towards your spine. But anyway, ours are about mid-neck. If you actually look at a chimpanzee, if you dissect one, you discover that its larynx and vocal cords are right up in the back of its mouth, just under the soft palate at the back of the mouth. What that does is it eliminates the sort of throat resonating chamber that we have. There’s that vertical part of our throat that actually acts to amplify sound and to filter it. And we actually do that also with our mouth. So that’s how we do vowels, is where I’m going with this. And vowels are absolutely critical to speech.

If, like a chimpanzee, you can basically only make the vowel sound ‘a’, which is about what they can do with the mouth chamber, you can’t say a sentence like, “Who hid the hat in the hot hut?” So I’m just using an H and D pair of consonants, but I’m putting a different vowel inside each one and saying, “Who hid the hat in the hut?” Now, that’s literally a major aspect of how language is even possible. So chimps don’t have that, but they also lack the sort of parts of the brain, or at least as far as we know, actually. This is actually, I guess, something that we can never know for certain. They seem to lack those parts of the brain that comprehend words as sound, which they can then try to put out through their vocal apparatus. But again, yes, their bodies just aren’t outfitted for speech.

Brett McKay: I thought that was interesting, you highlighted the point that having our larynx down lower actually makes us more susceptible to choking.

John Colapinto: Yeah, that’s an amazing… And actually, Darwin was the first to notice that. Because our larynx is so low, it actually brings the opening to our lungs right beside where the opening to our esophagus or food pipe is. So every time we swallow, food has to pass across this very dangerous opening to our lungs, which is where our vocal cords open and close. And so people have been choking to death for centuries because of a mistimed swallow. The Heimlich maneuver has helped with that. But Darwin said, “This is totally against everything I know about natural selection, which is supposed to increase our chances of living, rather than increasing our chances of dying.” And he saw that our larynx was lower; he just didn’t know why. It was in the 1950s and ’60s that people discovered that throat resonance chamber that I mentioned, and its importance to creating vowels.

And a wonderful linguist and scientist at Brown University named Philip Lieberman said, “Hey, man, the reason that larynx is so low is it actually… Yes, it increases the chances of dying, but it also improved or it created our ability to speak, which just outweighed the dangers of choking,” evolution being kind of a balance between things that are advantageous and disadvantageous. For a primate that could not run as fast as a leopard, or wasn’t as strong as a bear or whatever predator was there, our ability to speak was critical, not only to our surviving, but to shooting to the top of the food chain, because by speaking to one another, we were able to make plans and we could outsmart these bigger, faster, more lethal predators. So the descended larynx, this larynx that literally moved down our neck over the course of evolution, was a surprisingly critical thing for our species.

Brett McKay: And do we know about when that happened? When did humans start using their voice? And then I guess the question would be like: How did the act of using the voice change our thinking?

John Colapinto: Yeah, well, I guess it’s believed about 500,000 years ago. We first had to stand upright from being primates that were knuckle-dragging knuckle-walkers. And in standing upright, it’s theorized that that, partly, is what started to literally pull the larynx down in the neck. Interestingly, though, it probably just continued to descend even before we had language. And that’s because it gave an advantage in making us sound more threatening when we’d make a deep voice-like growl or grunt. Because we have that throat resonance chamber, it actually sounds more threatening. And it’s a size bluff, you sound bigger and more threatening. So it probably was offering an advantage to this rather weak and slow creature, this primate. But as we crossed over about six million years ago when we parted company with chimps and we started to become this human species, it was really, again, about 400,000 to 500,000 years ago, it’s theorized, that the larynx was in a position that permitted vowels and that we had the motor control and speed of tongue and lip movement that gave us the ability to actually say stuff. How that affected our brains is fascinating.

This guy, Lieberman, who I mentioned, parts company with almost every other scientist on earth at the moment, who follows Noam Chomsky’s idea that language began as thought, which in many ways, is a bizarre thing to claim. Thought? Not communication? As you pointed out, other animals have been communicating with their voices, threats and mating calls and so on since time immemorial. We were obviously doing that in our primate past. The idea that language didn’t evolve from vocal sounds is sheer insanity. And Lieberman thought so, and he spent 50 years looking at how that descending larynx and various other changes, literally, to our genetic make-up, which they’ve discovered, was actually what created language. In a sense, language followed the voice in a feedback loop. It sounds weird. How would that work? Well, we’re constantly feeding back into our brains through bodily movements like with our hands and digits; that’s when we started making fire and making tools. Our brains got smarter with more sophisticated abilities to move our bodies. That includes the vocal cords, tongue, lips. And in Lieberman’s beautiful conception, which has been proved by a lot of genetic evidence lately, it’s almost certain that language followed the voice. Talk about not giving the voice enough sort of importance in the world of science these days. It’s critical that we change our thinking on this.

Brett McKay: Yeah, that was interesting, it makes thought a very physical act, right? Yeah.

John Colapinto: Yes, yes. And that, too, Lieberman was obsessed with because he actually discovered that… He sort of traced thought to the earliest movement of mollusks and stuff with particular motor pathways. Now, on the one hand, that sounds totally bizarre. On the other, when you think about thinking, there’s so much movement involved. You feel your thoughts go from A to B to C to D. You navigate through a mental space in order to assemble ideas. A lot of it happens unconsciously, but a certain amount, in fact, a lot of it, is something that we actually feel as movement in our brains, I believe.

Brett McKay: Alright, so our voice changed the ways we think. It gave rise to language, and allowed us to be Homo sapiens. And let’s talk about how, on an individual level, a baby goes from basically sounding like a chimp. They just can wail, “Wah! Wah! Wah!” That’s it, to the point where by year two, year three, they’re saying full sentences. What goes on?

John Colapinto: Yeah, well, stunningly, one of the things that goes on… Maybe I’ll start with them actually still in the womb. Babies are actually learning language from the minute their hearing is in place, and that happens at about 28 weeks gestation as a fetus. They can hear their mom through her abdominal wall. But they also pick up a lot of her voice signal through what’s called bone conduction. As she speaks, her whole skeleton and her musculature vibrates, and it carries it down into that amniotic fluid that surrounds the baby. So that fetus is actually getting its first stimulus from a human voice that is vibrating against the soles of its feet, its legs, its neck, its face. So if you really think about it, we are absorbing language as this captive audience. And what babies are picking up at that stage, they’ve learned, is the stress patterns of particular languages. We stress either a first or second syllable more often in English than in French and so on. So they’re born with a basic, very basic understanding of how their language sort of sounds. It’s not great just because it’s kinda muffled in there. Once they’re born into the world, they’re suddenly in a bath of people speaking all around them.

And they are, we’ve discovered through these incredible scientific experiments, actually starting to distinguish what the particular sounds of our native language are. We’re born, science has discovered, able to hear every single sound in the world’s 6,000 languages. That includes the pops and clicks of certain African tongues, and babies can distinguish between them. They can distinguish between R and L sounds, which are quite close in certain languages. And so, they’re doing all of this kind of constructing of the sounds, but then, when they move to the part where they have to speak at around one year old, when they start to sort of… Well, they start babbling before that. The astonishing thing is they’re born with a larynx that’s in the same position as a chimp’s, as we talked about. It’s literally up at the back of the mouth because they breastfeed and they have to be able to breathe through their nose without coming up for air, and the milk flows around that raised larynx and into the stomach. But literally, over the first like three to six months of life, that larynx inches down the neck, they get on to solid food. Soon, they’re swallowing over the opening to their lungs as well. And their larynx moves basically where an adult’s is. It still has a way to go, but they start to be able to make vowels well enough that we can understand them.

One other thing I can’t resist mentioning that they have to do, we think we speak with a little space between each word, almost like words on a page. We don’t. It’s a constant ribbon of sound. Babies just hear that ribbon. How do they know where words begin and end? One way they know is by running a statistical analysis on the sounds that they’re hearing. So if you’re a Polish baby, you hear a Z and B sound together, like Zbigniew, the name Zbigniew. But if you’re an English baby, you never hear that combination within a word, but you do hear it across a word boundary. If someone said, “Leaves blow,” you’ve got that “zb,” but it’s going across a break in the words. Babies listen for the statistical preponderance of those sounds and sound combinations, and they literally realize, “Oh, that, I should split this. Because I’m an English baby, I’ll split the words there. That’s probably a word.” And there’s other techniques they use. Science is amazing in having discovered this, where they are figuring out what separate words are, and then they figure out what those words mean, and then they move into learning grammar and eventually, they’re speaking.

Brett McKay: And to hit back on this point of how our voice shapes our cognition, you highlight there’s… We have instances where if it’s… There’s a certain period of time where a child needs to learn language, and if they don’t, it messes them up for the rest of their life. And there’s that movie, that Jodie Foster movie, Nell, where that happens.

John Colapinto: Oh, yes!

Brett McKay: So what happens to children who don’t learn how to speak?

John Colapinto: Yeah, you know, what happens is there’s what they call a critical period or a window in earliest childhood where certain behaviors are learned then or they’re never learned. So what’s literally happening is the baby is building a brain that is capable of language by hearing language. I mentioned that we’re hearing the speech sounds of our particular language, B, K, Ah, A, E, I, O, U, and so on. With repeated hearing, the baby actually builds neural pathways, i.e., links together neurons in the brain that represent those sounds. And so if they keep hearing the ‘puh’, ‘puh’, ‘puh’ sound, they literally… And it’s interesting how this works. The electrical impulses that flash along the neurons actually build a layer sort of like on a copper wire, the insulation, it’s called myelin. It’s like a protein that’s built along the nerve, but which speeds up the way the neurons flash across each other and represent that ‘puh’ sound.

But if they don’t hear ‘puh’, that connection dies away. It literally doesn’t happen. So within that period, that critical period, you build a language-capable brain that both hears the sounds and also builds the neuro-motor pathways for saying them, and also sort of the musical movement of language that links things in grammatical structures. And you sort of have this period where if you don’t learn that stuff, you’re not going to. And we actually know this, in a way, because babies that are raised in bilingual environments, they effortlessly know French and English. But a baby raised just in English, if he… When he goes to high school, he’s gotta sit there and go, “Je suis, tu es, il est, nous sommes.”

He’s learning French by rote, and not really learning it, because it’s so damn hard to learn a foreign language when your brain has been sculpted for your native tongue. So that’s literally what’s happening.

Brett McKay: But I think it’s interesting, too, it raises the point, the question like, “What is the voice?” ‘Cause children who are born deaf, they are still able to get that language thing going if they learn sign language, but I imagine it follows the same pattern, except you’re just using your hands.

John Colapinto: Well, that’s… Yeah, ’cause people do ask that. They say, “Hey, wait a second. If the voice is so important…” But we have to remember that our species obviously is endowed with a talent, a special skill for language. If those sort of standard ways of communicating have been shut down because of some malfunction in the hearing, yes, then you use another… I mentioned before that speech is a gesture. Well, hands gesture, of course. And it’s astonishing and wonderful how human beings start to gesture with their hands. Deaf babies, they’ve discovered, babble with their hands. In that stage before speech is in place, hearing babies are practicing by saying, “Gah, gah, gah, gah. Bah, bah, bah, bah.” You know, deaf babies are doing something similar with their hands. It’s wonderful.

Brett McKay: Alright, so yeah, there’s all… There’s a voice there. Even if you can’t… You don’t use your larynx to make a voice; you still have a voice.

John Colapinto: Yeah, I guess that’s a good way of putting it, absolutely.

Brett McKay: So another thing you explore in the book about our voice is the difference between male and female voices. And you point out that… Humans are the only, pretty much the only species where there’s dimorphism, where there’s a big difference between genders in how they speak and their voice. Can you tell us a little bit about that?

John Colapinto: Yes, absolutely. We, as… Men, I should say, speak, on average, about a full octave below women, which is actually quite stunning because think of your dog or your cat or a bear, any mammal you care to name, they don’t… You cannot tell them apart sexually. The dimorphism is… They’re called monomorphic voices and species. We are a dimorphic species and strongly so with our voice. And so that raises the question, “Why? Why on earth would that be? Why did the male voice go low and the female high?” And the best that science can tell us at this point, it seems highly likely to me, is to look at stuff that Darwin said, actually, about mating behaviors. And he pointed out brilliantly that… We’re used to this idea that mating is around this idea of attraction to the way the other, the mate looks and sounds and moves and so on. So we think of a peacock with its feathers that seduce, and we think of a bird with its beautiful bird song.

Now, with human beings, a female might be looking at a male and listening and realizing, “Oh, there’s sort of a deep voice there.” Now, voices are deep because of testosterone; it literally makes our vocal cords bigger and thicker and slower-vibrating to give a lower voice. So over evolutionary time, women, females figured out, “Oh, if he’s got a deep voice, it means he’s got a good, steady shot of testosterone. He’s probably strong and capable and as aggressive as he might have to be in order to win food for me and the baby. And furthermore, I’d like to pass that along to my baby. All good.” But that’s not the only way mating works, because there’s also this thing called contest competition that Darwin talked about, whereby men or the male of the species, usually, have to fight each other for the favors of the female. And that also drove male voices down, down, down because, as I mentioned earlier, a deep voice is a threatening sound. It can be just a bluff, or it can be… It can really mean business like, “Boy, when the voice goes down there,” people know, “Okay, it’s serious.”

So one of the things that’s super interesting is it’s theorized that females like a deep voice in men, but not too deep because it means the man is overandrogenized or overtestosteroned. And that might mean that he’s kind of, let’s say like kind of rape-y, maybe he’s kind of like trying to find mates on the side, maybe he’s not gonna stick around to raise the baby. So it’s theorized that human male voices sort of ended up where they are, which is low and lots lower than a female’s, but not as low as a gorilla’s, because females did not choose the super deep voices. And so a slightly more mid-range male voice was propagated in our species through reproduction, and that’s how we have ended up where we are as males with these voices.

Brett McKay: Yeah. And you also, men… There’s also an attraction. Men are listening to female voices for attraction, and I guess the two things are its breathiness and a higher pitch.

John Colapinto: Yes, and I don’t wanna leave this out. I got in trouble on another podcast or possibly interview where I think people said, “Yeah, you didn’t talk about women’s voices.” And they were right, I didn’t get that opportunity, so I’m glad you’ve asked. Yes, it’s, again, theorized, we can never know for sure. But men, through testing, college-aged men, find the higher voice in a woman more attractive as a mate. They statistically say that’s more attractive, and they also like a slightly whisper-y edge to the voice or a… It’s almost hard to describe what this sound is like. Now, the reason why, we believe, is that the higher voice suggests that the person is fertile because when women go through menopause, their voices deepen. So a higher voice in a woman is suggestive of reproductive health. What’s that whisper-y edge? Well, fascinatingly, at puberty, women’s voices change in a particular way, whereby the vocal cords don’t quite meet fully at the back of the vocal cord. Our vocal cords make sound by vibrating against each other. Now, they have a little gap where some of the air from the lungs sort of whispers through in a slightly breathy kind of whisper-y sound. Now, if you think of Marilyn Monroe, she exaggerated both of those qualities. So she spoke in a Kewpie doll high voice with a lot of whisper-y edge to it. So the feeling is that men are sort of subconsciously, unconsciously hearing these things in a female voice and being attracted to it.

But I just hasten to add that all of this is very, very slippery because of course, there are women with deep voices that sound particularly sexy. Lauren Bacall, the actress from the 1940s was sort of famed for her deep voice. So you always have to weigh these theories against certain things that contradict them. But anyway, it’s… Her voice actually had some breathiness, so who knows?

Brett McKay: So another aspect, sort of a cultural influence on our voice, and I’m gonna bring this up ’cause I’ve been hearing it, aboot, the Canadian accent.

John Colapinto: Ah, lovely! Yes, and…

Brett McKay: Let’s talk about accents. So why do… Right? Why do we have accents with our voices? Do we know why that happens?

John Colapinto: Yeah, you know what? It really is an astonishing thing. Accents are really like territorial sounds. They’re almost like ways for groups of people to indicate their membership with each other and their exclusion of an interloping other. And we know that from extraordinary studies by a guy named William Labov. And he started in the early 1960s studying the fishing people on Martha’s Vineyard, a tiny island off of Massachusetts. And there, he discovered that these fishing families were, oddly enough, starting to speak with an accent that was many generations old. It sort of lingered in the oldest people, but all of a sudden, the youngest people were doing it. He didn’t know why. He interviewed them, he did a sociological deep-dive, and discovered that, really, what they were doing was trying to differentiate themselves from the mainlanders from New York and Boston, the rich summer folk who would pull into town.

And it so happened in this time in the ’60s, the fishing families were going through a terrible economic crisis, and they were losing their homes. They were having to sell them at fire sale prices to the city folk. And they were moving out of these ancestral homes built by their fishing ancestors, and having to move into shacks inland. What this made them do was despise those folk, and revert to an accent that said not only, “Hey, we’re part of this storied fishing past of America that’s in Moby Dick, for heaven’s sake! This was the economy of America. We belong to that special group,” but furthermore, “This is our island.”

Now, this indicates that accents are kind of about pushing people apart from each other. And the guy that wrote Pygmalion or My Fair Lady, George Bernard Shaw, knew this. He actually said, “An Englishman cannot open his mouth without making another Englishman hate or despise him.” Here in America, I think we pride ourselves on being a democratic society that doesn’t have a class structure as much. I’m afraid we do. You have Northerners looking down on Southern-accented people. You’ve got flakes out in California speaking in Valley girl sounds. You’ve got Midwesterners with their Marge Gunderson from that Coen Brothers movie way of talking. All of these differentiations are about saying, “I belong to this crowd,” or, “I don’t belong to that one.”

Brett McKay: No, it’s true. So I lived in Mexico for a few years, and even within Mexico, there’s differences, there’s regional accents. So you could tell if someone’s from Mexico City, or they’re from Sinaloa, or they’re from Veracruz. And then even within the Spanish-speaking world, there was… There was a lot of, I don’t know, sort of… In Spain, they speak with a lisp.

John Colapinto: Oh, yes, I’ve heard of that. Sure, absolutely.

Brett McKay: Right, so instead of saying like, “Sabado,” be a “Sabado”. And Mexicans, they make fun of that. And then the Spanish, they make fun of the Mexicans. They’re like, “You’re not speaking true Spanish.”

John Colapinto: Yes. Well, and the amazing thing is that these things do become essentially hardwired in childhood during that window that I talked about, that sort of critical period. And it’s why I say…

Or however I do it, I can’t even imitate it ’cause I’m unconscious of doing it, but people hear it. So yeah, these things are getting really cemented in very early, and you’ve gotta work hard to get rid of an accent.

Brett McKay: And it’s weird how things can develop. You talked about the development of that sort of Chicago accent, right? There’s an S in all…

John Colapinto: Oh, incredible.

Brett McKay: Yeah, the bears, the bulls. And you think that’s how they’ve always spoke in Chicago. But this one guy dug into it and was like, “No, it actually wasn’t until like the 1960s that people started talking like that.”

John Colapinto: It’s absolutely stunning. It was actually that same guy that did the Martha’s Vineyard study. He actually then… He looked at why people in that entire Great Lakes region spread out across the sort of northern middle part of the country. They’ve been increasingly saying, not fat and the name, Anne, fat Anne, they’re saying, “Fiet Anne.” And they’re speaking in this really distorted way with their vowels. So he looked at why. And he literally traced it to the migration of people in the mid to late 1800s from Upstate New York area when the Erie Canal, of all things, was built. They were suddenly able to people that part of the Great Lakes region.

Now, when they moved in there, they encountered another group of people who were upland Southerners from places like Kentucky. Now, it so happened that these groups had totally different approaches and values to life in almost every respect, from drinking, to whether or not stores should be open on Sunday, to whether or not women should have special rights. The Northerners were kind of like PC millennials of today. They were very sort of liberal. Now, the Southerners were kind of… They were into capital punishment, they were just more loosey-goosey about a lot of stuff. So you had this incredible culture clash.

Now, how did it manifest? As they became more acutely sort of aggressive towards each other, the Northerners thought, “Hey, I don’t wanna sound like them at all,” and they actually began to exaggerate that A sound in the word fat, let’s say. And they started to push it forward in the mouth, the tongue went higher. Fat to fet, fet to fiet. And over time, literally, and into the ’60s, this was still evolving because of course, the political divide has gotten no less. In fact, it’s gotten probably stronger. And you literally had the Northerners pushing the vowel forward, while the Southerners, I really believe, in response, they dropped the tongue back. They started to give it an last drawl. You know, “Fiet.” You know, why… I’m not gonna say “Fiet.” So you had these, these literally, these Americans pushing themselves apart with their voice, but also politically, socially, and in terms of value. This is how deep voice goes as an indicator of mass sort of movements in American society or in the world. It’s just fascinating, I think.

Brett McKay: Well, you mentioned one sort of accent that’s… It’s new in America and it’s the Valley girl type thing, it’s called uptalk. Most people talk about it like… Talk about uptalk. And the other one they typically, when you hear people talk about uptalk, they are also talking about vocal fry. For those who aren’t familiar with these concepts, can you describe what vocal fry and uptalk is? And has anyone studied why are younger people in the United States speaking with vocal fry and uptalk?

John Colapinto: Yeah, well, it’s interesting now. You know, uptalk is when people end even statements with a question, as if they’re always asking a question. And so they say that they’re gonna go to the store? Because we’re… And right now, we’re talking on a podcast? So that’s uptalk. The vocal fry is this growly sound that you’ll notice young women, largely, you tend to notice it in young women, but older women are doing it, where it’s almost a crackly sound of the voice and it’s kind of like down there. I wish I could do it better, but…

Brett McKay: Yeah, like this.

John Colapinto: Yeah, exactly!

Brett McKay: Yeah, it’s like Kim Kardashian talks, with a vocal fry.

John Colapinto: Yeah, totally.

Brett McKay: Yes, hello?

John Colapinto: And you were asking, “Has anyone looked at where it came from? Why it’s become sort of this epidemic amongst American and Canadian women?” And the belief, literally, is that it can be traced to the popularity of the Kardashians’ TV show, which started in 2007, but peaked in its popularity in 2010, which is exactly when linguists became fascinated. They were like, “Why is every young woman talking this way?” Now, I wrote about this in the book, and I theorized at first that women were doing it, particularly at that period, 2010, shortly after the big sort of economic crash of the subprime mortgage meltdown, where all of a sudden, millennials’ lives didn’t look too easy. It was a fearful time. Kim, however, was a pampered Beverly Hills billionairess or whatever she was, millionairess, who had not a care in the world. And she spoke in this way that kind of erased all emotion from her voice because with vocal fry, you can’t go up and down. You always sound like you’re kind of bland and blasé and in control.

So my initial theory was that women were quite understandably disguising any anxieties they felt about the future and about life in this imitation of Kim. But the thing is it’s gone on way too long. It’s still accelerating in the world. My new belief is that that initial use might have sort of morphed into something else after the 2016 election, where you had the rise of the Me Too movement and you had women now with a sort of newly energized feminism that really derived from a feeling that the government was being run by people that were inimical to women and their rights. So suddenly, the vocal fry became really an assertion, a growl, a way for women to say to men, “I really mean business.”

And in the book, I point out that in the ’70s, we had, “I am woman, hear me roar.” But roars are kind of theatrical. But a growl. A growl is actually something that’s kind of… It’s across the animal kingdom. And it’s produced exactly like the vocal fry. It’s the same set of laryngeal muscles. You actually tighten and stiffen the vocal cords so the air moves through them in these crackling sort of bubbles. And when a woman, my theory is, is doing that, I think she’s sending a signal that, “I demand to be taken, not just seriously, but as a legitimate threat to you, if you abuse me.” And I think that might be why we’re hearing the fry as much as we do in women. Men also do it though, which is not often pointed out.

Brett McKay: Yeah, dudes do it, and I’ve talked to a vocal coach about this, and he said when you see guys doing vocal fry, they’re usually trying to make their voice deeper, because they don’t have that.

John Colapinto: Ah, yeah.

Brett McKay: But he says it’s not great for your voice. It’s not the best thing to do ’cause actually it can…

John Colapinto: Yeah, they say it’s hard on the vocal cords. I think, right?

Brett McKay: Yeah. But then the up-talk, why do you think these drive people nuts?

John Colapinto: Oh wow [chuckle]

Brett McKay: To me, it just bugs the crap out of me. ‘Cause I don’t know, are you making a statement, are you making a question? What’s going on?

John Colapinto: Oh, I can’t believe that, Brett. Why does it bother you, Brett? No, it’s so annoying! You know what it is? It sounds defensive because interestingly, I mentioned before that we deepen our voice in order to sound threatening. But we raise our voice in order to sound submissive and loving, and we speak this way to babies and to pets just very naturally. We just do that. It’s a reverse size bluff, we’re trying to sound small. Now, a wonderful linguist named John Ohala from Stanford years ago said, “That’s literally why almost every language has a raised pitch at the end for a question.” Because when you ask a question, you are literally becoming submissive to… You’re giving up control and authority to the person you’re asking the question of. Your voice goes up. “Is it nice out?” You don’t know so you’re becoming submissive. The person says, “Yes, it’s really nice,” and they deliberately answer in a voice that does not go up but down. When someone is doing up-talk, you really get the feeling that they’re constantly asserting sort of a non-threatening submissive, “Oh, I’m just like… I don’t know anything, I’m just totally at your mercy and you know everything,” and that starts to grate.

You might sort of think, “Oh gee, I guess I would kind of like someone that sort of makes me feel dominant and… ” No. We prefer to try to work with people as equals if we’re sane and normal. Someone that’s constantly putting themselves in a position of sort of supplicating, questioning, submissiveness, you wanna slap ’em. One of the places it might have become so popular, and Labov said this as well… Incredibly enough, was the song by Moon Unit Zappa back in the very early ’80s, whose name is gonna elude me, that song… No, I can’t believe it. It’s in my book where she raps in it too. Oh, well, she also does a little bit of vocal fry, but it goes way way back to then possibly. And then the movie Clueless picked it up, and that was a very popular movie, the girls there did it. Now, that suggests too that things in our culture are highly contagious, that things about voice are very easily picked up as fads, short-lived fads, but then they can become like almost permanent aspects of accent as they get passed down to children when they’re in the crib. So that might also be why we’re hearing so much upspeak and vocal fry.

Brett McKay: So we talked about too, when you speak, you’re not just conveying information with words and language, your voice itself can carry information about your emotional state. And it can be very subtle, it’s not just like… I think everyone knows what a happy voice sounds like, a sad voice, but there’s like these really subtle ones, like annoyed, disdain, contempt that you can pick up on in a voice. So… Do we know… Have scientists figured out, put an algorithm, they know… If they hear this voice, this is contempt. Do we know about that?

John Colapinto: Yeah. Great question. For years, decades, one particular scientist in the… I don’t know how many years, 40 years he devoted to it, trying to study emotion in voice, and think of how hard that is to do because… In science, you want someone to be able to repeat a particular behavior so that you can be sure you weren’t getting a one-off sound. So how do you make someone make a jealous statement that sounds jealous or envious, or hostile, or happy or sad, or some weird blend of those emotions? So he figured out ways of doing that, he started using method actors and actually that worked pretty well, and he scrupulously dissected the acoustic signal using instruments called spectrographs and oscilloscopes, and this guy drew up these huge charts with decimal-pointed measurements of volume versus pitch versus the speed.

Now, you can imagine how incredibly complex that was because it’s literally those tiny adjustments. Now, the thing is that his work ended up being essentially worthless because it was just like… Who could do anything with this? But you mentioned a really interesting thing when you say, “Is there an algorithm?” Because what the big tech companies are doing now is trying to imbue computer voices with convincing-sounding human emotion, and the way they’re actually doing it is not by using all those little micro-measurements like this guy did, because you could never input all of that into a computer. They’re using machine learning, they write an algorithm where the computer can teach itself, and then they play emotional speech into the computer with the emotions labeled, and the computer literally learns them itself. It’s an astounding thing and actually not… A little bit scary, I have to say, because we’ve suddenly got computers learning the way babies do. I talked about how we sort of inculcate babies with speech by them hearing speech.

Well, that’s how our computers are getting so good at doing language, but now they’re getting good at the secret ingredient of language that really makes us sound human, which is that emotional… What they call prosody, it’s the song-like part of our voice, and if you don’t use prosody you become unlistenable, that was horrible what I just did. And if I kept it up, you’d have ended the podcast, so you’ve gotta have the music and… Yeah, really we don’t know to… Now, here’s the short answer, we don’t really know what all those variables are, but computers are learning them anyway. It’s amazing.

Brett McKay: Yeah, next time you ask your Alexa what the weather is, it’s gonna ask, “John, you sound kind of annoyed”, is… Right.

John Colapinto: Correct. That’s exactly what they say, or are you lying? They might call you out on a lie, this is where the nightmare could go.

Brett McKay: That’s creepy. Yeah, that’s why I don’t have an Alexa. God, no don’t do that.

John Colapinto: Well, those Alexas are gonna start learning from your voice, I mentioned that, ’cause they’re gonna start putting that learning software into these devices that we use, Siris and Alexas and so on. And so every time we speak, when we call it up to say, “Hey, when is Justin Bieber’s birthday?” something we all are curious about, it’s literally gonna be learning some of the prosody of our language and speech from that.

Brett McKay: Well, another aspect of voice that you hit is singing, this kick-started the whole project, you were a singer, you’ve got a polyp on your voice, on your vocal cords. What is it about singing that makes people feel really vulnerable? Why don’t we like to sing?

John Colapinto: It’s so weird. It strips us naked. It’s this way in which… I mentioned in the book that if you could say to someone in a work environment, “Hey, could you get up and deliver this report?” even if they don’t like public speaking, they could fumble their way through it. Try asking them to sing a solo song to all their co-workers. The minute you lift your voice out of normal speech and launch it into the melody and rhythm, it strips you… Us to our human core in a way that science doesn’t really understand. We just know it when it happens, and it touches on emotions deep in us that are maybe it goes back to our moms singing to us when we were babies, but I think it goes further back than that.

Darwin thought that our language itself emerged out of the singing cadences of early primates. So it’s almost as if funnily enough, even though we exalt our linguistic capability, which I believe brought us to the top of the food chain, in a funny way, it’s finally though that music where we find our most human universal emotional salience and to sing is to just totally bare ourselves, bare ourselves nakedly. And that works in certain ways that it can be so beautiful, and I do point this out in the book that when President Obama was addressing the Charleston church where there’d been a horrible shooting, many people killed.

He started to speak and then he fell silent for 12 seconds, which is an infinitely long period of time to be silent. And when he next made a vocal noise, it was to start singing Amazing Grace. And he actually… Someone just recently told me that I was correct, as he recently said that he had run out of words, there was no words for how horrible this was, and the only recourse was to singing. Now, one of the world’s greatest orators, Obama, that’s quite an admission. I can’t really answer what that mysterious sort of soul-fullness is, except to just say it’s real, it’s true, and it’s a beautiful mystery.

Brett McKay: Going onto that, you talk about pop music today. One of the complaints… Some of it’s really complex and sophisticated, but one of the complaints that people have and music critics have is that it’s too perfect because we have this technology that allows us to take a singer’s voice and auto-tune it, make it sound different, put them back in pitch and put things on beat. And when you hear it, you’re like, “That doesn’t sound right. Doesn’t stir me.”

John Colapinto: Yes.

Brett McKay: What is it about the imperfection that started… Did we figure that out?

John Colapinto: Yeah, it’s fascinating. A guy actually, back in the ’20s, the 1920s, studied singing voices, and was actually amazed to discover that all singers, even highly trained classical singers, are singing off pitch very often or starting off the pitch and then moving into the correct pitch and then moving out again. They’re jumping on beats a little bit in advance or lagging behind in order to create emotional effects. There’s something called vibrato, “Ah,” where you’re literally going between two notes, but it sounds like you’re sort of on the pitch of one note, but if you think about it, you’re not. You’re wobbling. And these create emotion. Now, why on earth that is? It’s sort of like if you look at a painting, a beautiful impressionist painting, Monet, you’re gonna see his brush strokes. And if you get close to the canvas, you’ll suddenly realize, “Wow, that’s kind of messy and that kinda looks wrong.” You step back for a minute and it all assembles into something beautiful and alive.

Japanese calligraphers with ink and brush pen, they love the little mistakes. They would put in mistakes into things that they made, because that’s where we find our humanness. Now, the sort of “mistakes” in the human voice are absolutely real in singing, we almost… We can’t really control them. These emerge as part of the emotional expression of a singing voice, the voice trying to find a note. It’s got a yearning quality of seeking the note, of seeking the thing that’s gonna unlock our emotion. And it’s beautiful to hear the voice get there. Now, when you use Pro Tools to just center the person right on pitch and right on the beat, you’re necessarily draining off a whole bunch of the humanity. And it’s just surreal, you can hear that. Listen to Bob Dylan who’s not auto-tuned, and he’s not even a great singer in terms of pitch and stuff, but he’ll break your heart with certain songs.

Taylor Swift, who was… When she first broke, I remember, with the album “Fearless,” I saw her on Letterman, and she was very touching, and I remember thinking it was because she was kind of a lousy singer. She was a little off pitch, but she was singing about her first emotional affair as a young woman and my God! Her being sort of not quite on the beat and a little uncertain in her pitches, it absolutely contributed to this entire beautiful emotional effect of vulnerability. Now she’s a big stadium dance, EDM singer, with propulsive beats, and robotic sounds, and they’re centering her voice on pitch with Pro Tools and it’s all gone. The emotion’s gone. She’s a different singer now, a successful one. But she’s not gonna break your heart as readily.

Brett McKay: Do you know of… Are there musicians who are rebelling against the pro… The Pro Tools?

John Colapinto: I believe they are, yes there are. But I say that ’cause they say it. I wish a name would come to mind, but they kind of like to boast that they didn’t use… But to be honest, people do slide off pitch on take 35 of a song that they’re singing and sometimes the engineer just nudges them onto the pitch. I guess the point is that it’s irresistible. With all of our computers now, you just… You just know you can hit that button and get rid of the zit before you put the photograph of yourself up, or whatever. Do you really lay off that zit remover click? So I don’t know. It’s a really… Actually, I should have asked some engineers this, whether or not anybody sings in their naked… I would imagine Dylan, but I don’t know who else.

Brett McKay: No, that’s weird. Yeah, that’s funny how we have that propensity to wanna remove our humanity. But then…

John Colapinto: Yeah.

Brett McKay: But once you do that with Photoshop or the audio tools, you enter that uncanny valley and you’re like, “This doesn’t… ”

John Colapinto: Correct.

Brett McKay: “This is not right,” but you still have the compulsion, “I need to do it.” [laughter] It’s weird.

John Colapinto: I know exactly, I know. “I wanna be perfect!”

Brett McKay: Right. In the book, I thought it was interesting, about how… What happens to our voice when we get old. And I never really thought about that, I was like, “Oh.” I thought about going grey, I thought about getting wrinkly, but I never thought my voice would age as well, but it does.

John Colapinto: Yes. Yes, it does. Every component of voice that we talked about earlier, everything from your lungs to your vocal cords, to the articulators, lips and tongue, all of them basically degrade. They get weaker. They get less powerful. They get less precise, so you sort of hit the wrong tongue targets, so your speech starts to sound a little blurry. But one of the biggest things that happen is that your vocal cords, it’s sort of like your knee ligaments or something, they tighten up, they stiffen, they get arthritic. And a young voice has a ripple, a beautiful, almost liquid ripple to the way the vocal cords vibrate and move. That goes away. They really become stiff, and kind of crystalline and kinda hard. And you start to hear that in a croaky old person’s voice. The voices get more quiet ’cause you don’t have as much lung power as an old person. All of a sudden, the muscles are weakening in your… We sound loud by pushing that diaphragm, it’s a big muscle. All sorts of stuff happens.

Fat collects around the neck and literally makes the resonance chamber of the throat kind of smaller. Women’s voices lower with age because their vocal cords get kind of thicker and stuff, and men’s kind of raise. So men and women, when we talked about the dimorphism of male and female voices for mating, well, when mating is no longer an option, the voices actually move together. They start to sound more and more the same, men and women. So you’re sort of de-sexed, your power goes away. It’s not pretty, and you really hear that in aged voices.

Brett McKay: So what do you hope people will walk away with after reading your book? It’s like as you said in the very beginning, like this is not a how-to book on how to sound like James Earl Jones or Frank Sinatra or whatever, but what do you hope people walk away with?

John Colapinto: Yeah, I hope they end up with an impression of I guess maybe two things. Just how important our voice was to us as a species. I really do think it drove us to the top of the food chain. But I also do, by talking about all the different things that control and make our voice, hope that people will glory in their own voices, weirdly enough, a little bit, because it is an instrument. You watch a wonderful clarinet player or a saxophone player and you’re impressed with what they’re doing. When you just say, “Pass the salt,” you’re doing something more remarkable in terms of speed of movement and precision of movement and so on, and even tune and pitches. So I would love to see people push their voice out there, project it, use your articulators, animate your voice, get excited, sound enthusiastic, possess the air around you with these vibrations because we’re not here that long. So I guess my feeling is, I do sort of end with a little bit of uplift because our voices are one of the main ways that we imprint ourselves on the world that we occupy. It’s part of the web that we extend in order to connect to everybody else, and you might as well give it some power, give it some style, speak up, don’t be shy. So I guess that’s where I would go with that.

Brett McKay: Well, John, this has been a great conversation. Where can people go to learn more about the book and your work?

John Colapinto: Well, you can definitely look at my Twitter because I’m always tweeting… JColapinto… I’m always tweeting about it, but just read the book. It’s on Amazon, you can get it at book stores, if you can go in them at this point with a mask on. And I actually do think I have to sort of rather hubristically say that my book actually pulls together all of these different strands and disciplines to look at the voice in this global way that frankly no other book really does or has done. That was actually one of the great challenges of doing it. You got books on linguistics, phonetics, singing, oratory, accents, but they’re all separate books and separate fields of study, so for a book that… I really am selling it here… But for a book that kind of pulls it all together in a narrative way, you might just go to my book, This Is The Voice. I don’t know where else to point you.

Brett McKay: John, this has been a great conversation. Thanks for your time, it’s been a pleasure.

John Colapinto: Brett, likewise.

Brett McKay: My guest today was John Colapinto. He’s the author of the book, This Is The Voice. It’s available on amazon.com and bookstores everywhere. Make sure to check out our show notes at aom.is/thisisthevoice where you can find links to resources where you can delve deeper into this topic.

Well, that wraps up another edition of the AOM podcast. Check out our website at artofmanliness.com, where you can find our podcast archives as well as thousands of articles written over the years, and if you’d like to enjoy ad-free episodes of the AOM podcast, you can do so on Stitcher Premium. Head over to stitcherpremium.com, sign up, use code MANLINESS at checkout for a free month trial. Once you’re signed up, download the Stitcher app on Android or iOS and you can start enjoying ad-free episodes of the AOM podcast. And if you haven’t done so already, I’d appreciate it if you’d take one minute to give us a review on Apple Podcasts or Stitcher. It helps out a lot, and if you’ve done that already, thank you. Please consider sharing the show with a friend or a family member who you think would get something out of it. As always, thank you for the continued support. Until next time, this is Brett McKay, reminding you not only to listen to the AOM podcast, but put what you’ve heard into action.
