Harnessed: How Language and Music Mimicked Nature and Transformed Ape to Man

If music comes from speech, then it doesn’t come from the phonological patterns of speech, or from the semantics of speech. Although these core functions of speech are dead ends for a theory of music, there is another aspect of speech I have purposely glossed over. People overlay the sterile solid-object event sounds of speech with emotional overtones. We add intonation, a pitchlike property. We vary the emphasis of the words in a sentence, reminiscent of the way rhythm bestows emphasis in music (for instance, the first beat in a measure usually has enhanced emphasis). We vary the timing of the word utterances, akin to the temporal patterns of rhythm in music. And we sometimes modulate the overall loudness of our voices, like a musical crescendo or diminuendo. These prosody-related emotional overtones turn Stephen Hawking computer-voice speech into regular human speech. And these emotional overtones can be understood even in foreign speech, where our ears can often recognize the glib, the mournful, the proud, and the angry. We’re just not sure what they are glib, mournful, proud, or angry about.

So it is not quite true that speech sounds are sterile. Rather, it is the phonological solid-object event sounds that are sterile. The overtones of speech, on the other hand, are dripping with human emotion. Might these overtones underlie music? In an effort to answer, let’s discuss the four questions at the heart of any theory of music, the ones I referred to earlier as “brain,” “emotion,” “dance,” and “structure.”

Do we have a brain for the overtones of speech? An overtone theory of music would like to say that music “works” on our brains because it taps into speech overtone recognition mechanisms. Are we likely to have neural mechanisms for recognizing overtones of speech? Although I am suggesting in this book that we did not evolve to possess speech recognition mechanisms, we primates have been making nonspeech vocalizations (cries, laughs, shrieks, growls, moans, sighs, and so on) for tens of millions of years, and surely we have evolved neural mechanisms to recognize them. Perhaps the overtones of speech come from our ancient nonspeech vocalizations, and they get laid on top of the solid-object physical event sounds of speech like a whipped cream of evocativeness, a whipped cream our auditory system knows how to taste. An overtone-based theory of music, then, does have a plausible story to tell about why our brain would be highly efficient at recognizing overtones.

Can overtones potentially explain the evocativeness of music, the second hurdle we had discussed for any theory of music? Of course! Overtones are emotional, used in vocalization to be evocative. If music mimics emotional overtones, then it is easy to grasp how music can be evocative.

Can an overtone theory of music explain dance, the third hurdle I mentioned earlier? One can see how the emotional nonspeech vocalization of other people around us might provoke us into action of some kind—that’s probably why people are vocalizing in the first place. That’s a start. But we would like to know why hearing overtones would not just tend to provoke us to do stuff, but more specifically, make us move in a time-locked fashion to the emotional vocalizations. I have not been able to fathom any overtone-related story that could explain this, and the absence of any potential connection to dance is a hurdle that an overtone theory stumbles over.

Finally, can overtones explain the structure of music? Do the overtones of speech possess the patterns of pitch, loudness, and rhythm found in music? There is, at least, enough structure floating around in the prosody of speech that one can imagine it might be rich enough to help explain the structure found in music. But despite the nice confluence between ingredients in the overtones of speech and certain similar ingredients in music, overtones appear to be a very different beast from music. First and foremost, what’s missing in the overtones of speech is a beat, and a rhythm time-locked to a beat. That’s the one thing the lub-dub theory of music captured, but it is one of the most glaring shortcomings of overtone-based approaches, and it ultimately takes overtones of speech out of the running as a basis for a theory of music.

Before leaving speech for more fertile grounds—in fact, the next section is about sex—consider the two hurdles where overtones appeared promising: “brain” and “emotion.” I suggested earlier in this section that overtones could rely on ancient human nonlinguistic vocalizations, but there is another potential foundation for overtones’ evocative nature: the sounds of people moving. Rather than music coming from the overtones of speech, perhaps both music and overtones have their foundation in the more fundamentally meaningful sound patterns of humans’ expressive movements. (And perhaps this is the source of the intersections between music and speech in the brain discerned by Aniruddh D. Patel of the Neurosciences Institute in La Jolla, and other researchers.)

How About Sex?

Music does not appear to have its origins in the beating heart or in the overtones of speech. That’s where I stood on the problem as recently as 2007, when I had recently left Caltech for RPI. I was confident that music was not lub-dubs or speech, but I had no idea what music could be. I did, however, have a good idea of some severe constraints any theory of music must satisfy, namely the four hurdles we discussed earlier: brain, emotion, dance, and structure. After racking my brain for some months, and perhaps helped along by the fact that my wife was several months delayed in following me across country to my new job, it struck me: how about sex?

Reputable scientific articles—or perhaps I saw this in one of the women’s magazines on my wife’s bedside table—indicate that to have sex successfully, satisfying both partners and (if so desired) optimizing the chances of conception, the couple’s movements should be in sync with each other. Accordingly, one might imagine that we have been selected to respond to the rhythmic sex sounds of our partner by feeling the urge to match our own movement to his or hers. Evolution would select against people who did not “dance” upon hearing sex moves, and it would also select against people who responded with the sex dance every time a handshake was sufficiently vigorous. The auditory system would thus come to possess mechanisms for accurately detecting the sexual sounds of our partner. A “sex theory of music” of this kind has, then, a story for the “brain” hurdle.

In addition to satisfying the “brain” hurdle, the sex theory also has the beginnings of stories for the other three hurdles. Emotion? Sex concerns hot, steamy bodies, which is, ahem, evocative. Dance? The sex theory explains why we would feel compelled to move to the beat, thereby potentially addressing the “dance” hurdle. (In fact, perhaps the “sex theory” could explain why dance moves are so often packed with sexual overtones.) And, finally, structure? The sounds of sex often have a beat, the most essential structural feature of music a theory needs to explain.

I was on a roll! But before getting Hugh Hefner on the phone to go over the implications, I needed to figure out how to test the hypothesis. That’s simple, I thought. If music sounds like sex, then we should find the signature sounds of sex in music. The question then became, what are the signature sounds of sex? What I needed was to collect data from pornography. That, however, would surely land me in a heap of trouble of one kind or another, so I went with the next best thing: anthropology. I began searching for studies of human sexual intercourse, and in particular for “scores” notating the behavior and vocalizations of couples in the act. I also found scores of this kind for nonhuman primates—not my bag—which, I discovered, contain noticeably more instances of “biting” and “baring teeth” than most human encounters. My hope was to find enough of these so that I could compile an average “score” for a sexual encounter, and use it as a predictor of the length, tempo, pitch modulation, loudness modulation, and rhythm modulation of music.

I couldn’t find but a handful of such scores, and I did not have the chutzpah to acquire scores of my own. So I gave it up. I could have pushed harder to find data, but it seemed clear to me that, despite its initial promise, sex was far too narrow to possibly explain music. If music sounded like sex, then why isn’t all music sexy? And why does music evoke such a wide range of emotions, far beyond those that occur in the heat of sex? And how can the simple rhythmic sounds of sex possibly have enough structure to explain musical structure? Without answers to these questions, it was clear that I would have to take sex off the table.

Enough with the things I don’t think can explain music (heartbeats, speech, and sex)! It is about time I begin saying what I think music does sound like. And let’s edge closer to that by examining what music looks like.

Believe Your Eyes and Earworms

It is natural to assume that the visual information streaming into our eyes determines the visual perceptions we end up with, and that the auditory information entering our ears determines the events we hear. But the brain is more complicated than this. Visual and auditory information interact in the brain, and the brain utilizes both to guess what single scene to render a perception of. For example, the research of Ladan Shams, Yukiyasu Kamitani, and Shinsuke Shimojo at Caltech has shown that we perceive a single flash as a double flash if it is paired with a double beep. And Robert Sekuler and others from Brandeis University have shown that if a sound occurs at the time when the images of two balls pass through each other on a screen, the balls are instead perceived to have collided and reversed direction. These and other results of this kind demonstrate the interconnectedness of visual and auditory information in our brain. Visual ambiguity can be reduced by auditory information, and vice versa. And, generally, both are brought to bear in the brain’s attempt to guess about what’s out there.

Your brain, then, does not consist of independent visual and auditory systems, with separate troves of visual and auditory knowledge about the world. Instead, vision and audition talk to one another, and there are regions of cortex responsible for making vision and audition fit one another. These regions know about the sounds of looks and the looks of sounds. Because of this, when your brain hears something but cannot see it, your brain does not just sit there and refrain from guessing what it might have looked like. When your auditory system makes sense of something, it will have a tendency to activate visual areas, eliciting imagery of its best guess as to the appearance of the stuff making the sound. For example, when you hear the sound of your neighbor’s tree rustling, an image of its swaying, lanky branches may spring to mind. The mewing of your cat heard far away may evoke an image of it stuck high up in that tree. And the pumping of your neighbor’s kid’s BB gun can bring forth an image of the gun being pointed at Foofy way up there.

Your visual system, then, has strong opinions about the likely look of the things you hear. And, to get back to music, we can use the visual system’s strong opinions as an aid in gauging music’s meaning. In particular, we can ask your visual system what it thinks the appropriate visual is for music. If, for example, the visual system responds to music with images of beating hearts, then it would suggest, to my disbelief, that music mimics the sounds of heartbeats. If, instead, the visual system responds with images of pornography, then it would suggest that music sounds like sex. You get the idea.

But to get the visual system to act like an oracle, we need to get it to speak. How are we to know what the visual system thinks music looks like? One approach is to simply ask what visuals are routinely associated with music. For example, when people create imagery of musical notes, what does it look like? One cheap way to find out is simply to do a Google (or any search engine) image search on the term “musical notes.” You might think such a search would merely return images of simple notes on the page. However, that is not what one finds. To my surprise, actually, most of the images are like the one in Figure 16, with notes drawn in such a way that they appear to be moving through space. Notes in musical notation don’t look anything like this, and actual musical notes have no look at all (because they are sounds). And yet we humans seem prone to visually depict notes in lively motion.

 

Figure 16. Musical notes tend to be visualized like this, a clue to their meaning.

 

Could these images of notes in motion be due to a more mundane association? Music is played by people, and people have to move to play their instruments. Could this be the source of the movement-music association? I don’t think so, because the movement suggested in these images of notes doesn’t look anything like an instrument being played. In fact, it is common to show images of an instrument with the notes beginning their movement through space from the instrument: these notes are on their way somewhere, not tied to the musician’s key-pressing or back-and-forth swaying.

Could it be that the musical notes are depicted as moving through space because sound waves move through space? The difficulty with this hypothesis is that all sound moves through space. All sound would, if this were so, be visually rendered as moving through space, but that’s not how we portray most sounds. For example, speech is not usually visually rendered as moving through space. Another difficulty is that the musical notes in these images are usually meandering, but sound waves don’t meander—sound waves go straight. A third problem with the notion that sound waves are the basis for the visual metaphor is that we never see sound waves in the first place.

Another possible counterhypothesis is that musical notes are visually depicted in motion because all auditory stimuli are caused by underlying events that involve movement of some kind. The first difficulty, as with sound waves, is that not all sound, by a long shot, is visually rendered as in motion. The second difficulty is that, while it is true that sounds are typically generated by movement of some kind, it need not be movement of an entire object through space. Moving parts within the object may make the noise, without the object going anywhere. In fact, the three examples I gave at the start of this section—leaves rustling, Foofy mewing, and the BB gun pumping—are noises without any bulk movement of the object (the tree, Foofy, or the BB gun, respectively). The musical notes in these images, on the other hand, really do seem to be moving their whole selves across space.

Music is like rustling leaves, Foofy, BB guns, and human speech, in that it is not made by bulk movements through space. And yet music appears uniquely likely to be visually depicted as notes moving through space. And not only moving, but meandering. When visually rendered, music looks alive and in motion (often along the ground)—just what one might expect if music’s secret is that it sounds like people moving.

A Google image search on “musical notes” is one way to try to discern what the visual system thinks music looks like. Another is simply to ask ourselves: what is the most common visual display shown during music? That is, if people were to make videos to go with music, what would the videos tend to look like? Luckily for us, people do make videos to go with music! They’re called music videos, of course. And what do they look like? The answer is so obvious that it hardly seems worth noting: music videos commonly show people moving about, usually in a manner that is time-locked to the music, very often dancing. As obvious as it is that music videos typically show people moving, we must remember to ask ourselves why music isn’t typically visually associated with something very different. Why aren’t music videos mostly of rivers, avalanches, car races, windblown grass, lions hunting, fire, or bouncing balls? It is because, I am suggesting, our brain thinks that humans moving about is what music should look like . . . because it thinks that humans moving about is what music sounds like.
