Almost every day we experience music that gives rise to various emotions, through commercials, films, games or our personal connection with a favorite album. It is common informal knowledge that major keys make for happier melodies than minor keys, and that different scales and performative elements can make music sound sadder or happier, depending on their characteristics.

But what is it about the sound of a low legato cello that makes it sadder than a high, jumpy melody on a marimba? In this post, we take a look at how sounds can evoke emotions, starting with their building blocks.

What is Timbre?

We call our subjective understanding of a sound’s frequency spectrum timbre. Think of timbre as the sonic fingerprint or the quality of the sound. It is the thing that makes you recognise a violin over a piano, regardless of what note is being played.

Each note played on an instrument has an identifiable pitch or tone. This is the fundamental frequency, the central or most salient pitch of the note. On top of this fundamental sits a series of harmonics, or partials, at whole-number multiples of its frequency. Take the note “A”, tuned to a frequency of 440 Hz. The first partial above the fundamental is the octave, 440 Hz times two, which is 880 Hz. The next partial is the fundamental times three, 1320 Hz, which in musical terms lies a “fifth” above that octave. The next partial is then the fundamental times four, 1760 Hz, two octaves above the fundamental, and so on. What makes up the characteristic timbre of an instrument is the strength of the individual partials in relation to each other. Many different factors shape the strength of the partials, such as the physical materials used in the instrument body and how they affect a resonating column of air, the vibration characteristics of a particular string, or whether the instrument is played loudly or quietly.
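To make the arithmetic concrete, here is a minimal Python sketch that lists the first few harmonics of A at 440 Hz and how far each lies above the fundamental, measured in equal-tempered semitones. The function names are made up for this illustration and do not come from any library.

```python
import math


def harmonic_series(fundamental_hz, count):
    """Return the first `count` harmonics; harmonic 1 is the fundamental itself."""
    return [fundamental_hz * n for n in range(1, count + 1)]


def semitones_above(freq_hz, reference_hz):
    """Distance from reference_hz up to freq_hz in equal-tempered semitones."""
    return 12 * math.log2(freq_hz / reference_hz)


# Print harmonic number, frequency in Hz, and distance above the fundamental.
for n, freq in enumerate(harmonic_series(440.0, 4), start=1):
    print(n, freq, round(semitones_above(freq, 440.0), 2))
```

An octave is 12 semitones and a fifth is 7, so the third harmonic lands at roughly 19 semitones, an octave plus a fifth above the fundamental.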

Sound frequencies also develop over time. Plucking a string with a plastic guitar pick gives a short burst of noise, with harmonic and inharmonic partials that quickly fade out (within the first 30 ms), leaving room for the pure partials of the string as it continues to vibrate. These also fade over time, starting with the highest frequencies and working their way down. The whole envelope has an initial “plucky” attack and becomes softer over time; as it fades in volume, it also fades in its frequency spectrum. This can be seen in the spectrogram in the figure below.

A spectrogram showing the frequency spectrum of a plucked guitar string over time. Frequency is mapped on the vertical axis, and time in seconds on the horizontal. The brighter the color, the more energy.
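The way a plucked string loses its highest partials first can be imitated with a toy model. The sketch below assumes that partial n starts at amplitude 1/n and decays at a rate proportional to n. Both are simplifications chosen for illustration, not measurements of a real guitar, and the function names and the decay constant are hypothetical.

```python
import math


def partial_amplitude(n, t, base_decay=3.0):
    """Amplitude of partial n at time t (seconds).

    Assumption: amplitude starts at 1/n and decays exponentially,
    with higher partials decaying proportionally faster.
    """
    return (1.0 / n) * math.exp(-base_decay * n * t)


def brightness(t, f0=110.0, num_partials=8):
    """Spectral centroid in Hz: the amplitude-weighted mean partial frequency."""
    amps = [partial_amplitude(n, t) for n in range(1, num_partials + 1)]
    weighted = sum(a * f0 * n for n, a in enumerate(amps, start=1))
    return weighted / sum(amps)


# The centroid drifts down towards the fundamental as the note rings out.
for t in (0.0, 0.25, 0.5, 1.0):
    print(t, round(brightness(t), 1))
```

The spectral centroid used here is one common rough measure of how bright a sound is. In this model it falls towards the fundamental as the upper partials die away, which matches the softening described above.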


Not all instruments behave like a plucked string. Wind instruments, for example, don’t fade in the same way in timbre and volume over time, since they need air to be constantly blown through them to make a sound.

Non-tonal sounds have a much more complex timbre, but follow the same principles of composition. We get a lot of information in our daily lives from the sound of objects, and we can often tell the type of material or the approximate size or weight of an item from those sounds. From experience, we have taught ourselves the sound of a pencil dropping on a wooden floor, and we can recognise a finger tapping an empty plastic bottle by its hollow, dull and round sound. When describing timbres, we characterise a sound by comparing it to objects we have experienced with similar sound properties. Describing something as thin could refer to a sound “not being full”, or lacking a body that resonates at a lower frequency. Bass frequencies in the real world often come from big objects or bigger resonating chambers, so “thin” suggests a smaller object, whereas “bassy”, “deep” or “full” sounds are associated with bigger objects.

Descriptors for different timbre groups have been in use since as early as 1877, when Hermann von Helmholtz came up with a set of them (see the figure below, from Howard and Tyrell). It shows how different examples of grouped acoustic instruments are tied to specific descriptors, and how some of the descriptors have an emotional sensation tied to them.


Striations are frequencies above the 7th harmonic, where the spacing between the individual partials becomes too narrow for the ear to resolve them separately. If they dominate, they are perceived as noisy or harsh.


The human ear is very sensitive to transients, the “attack” of a sound, possibly because it has been beneficial in evolutionary terms to hear sudden changes in the world around us. Loud transient sounds usually mean some rapid change, which would often be associated with danger. Smaller or “thinner” sounds might not seem all that dangerous, because smaller things, such as small animals, usually aren’t a threat, whereas a big object with a lot of bass could be a landslide or a big tree falling, which could kill us. Pure tones, or timbres where even harmonics dominate, provide a rounder tone and might for most of us be associated with a friendlier, warmer and happier sound, whereas more odd harmonics with noisy content might be harsher on our ears and thus seem more unfriendly. Indeed, many of these descriptors do to some extent describe the sensation or feeling of those timbres when we hear something as “harsh” or “round”.

Emotional response to timbre could also have an origin in how we as humans express and perceive emotions through the timbral changes in the sounds of our own voices. When we are angry we might shout, which increases loudness and introduces a more distorted voice with more partials, whereas a tender emotion might be expressed with a quieter, softer and rounder-sounding voice, at a slightly higher pitch than a sad voice.

Various bodies of research show that different musical instruments evoke certain emotions. The reason for this is contested. The emotive connotations of instruments might be something we have been “taught” culturally from theatre or opera, and now from films and games, although there seem to be broader commonalities between emotions and specific categories of timbre, which may apply cross-culturally. For example, slower-attack and lower-register instruments with more striations, such as the cello and bassoon, are perceived as sadder, whereas brighter, short-attack instruments, such as the marimba or piano, point towards happier sounds.


Picking Timbre by Instrumentation

Now that we’ve looked at how different timbres can nudge sounds in specific emotional directions, let’s look at some examples of how this has been done by musicians. Sergei Prokofiev’s “Peter And The Wolf” (1936) is a good example of instrumentation that is tied very closely to characters. Each character in the story is represented by an instrument whose timbre hints at their personality, vocal characteristics, and role in the story. For example, the bird is represented by the flute, the duck by the oboe, the wolf by the horns, and Peter, the protagonist, by the strings.

This can be explained more technically. The slightly more nasal sound of the oboe resembles the quacking of a duck, and the thin, bright sound of the flute resembles birds chirping. The slightly more hollow and light, but sneaky, timbre of the clarinet represents the cat. The howling, round and deep sound of three French horns, and the slight dissonance between them, gives the sensation of something a little bigger, perhaps evil, representing the wolf. The main thing to take away from these short examples is how the spectrum of an instrument can be assigned to characters by resembling their real-world sound characteristics. It is quite easy to hear the difference between a silver flute and a real bird chirping in the woods; what matters more is how we think about these similarities, and the feelings that might be tied to these animals. Of course, harmonic progressions and melodies convey emotions as well, but the pure sound of the instruments and the performance is a clear example of how we connect timbre to emotion.

Another example of how instrumentation is tied to emotion and character, this time within the context of less “traditional” harmony, is the “Joker’s theme” from “The Dark Knight”, by Hans Zimmer. The piece focuses much more on the eerie feeling and dissonance of the instruments than Peter and the Wolf does. The main sound consists of one “object”: distorted guitars, cellos and recorded material, such as piano strings played with razor blades.

The strings, distorted partly by effect processing and partly by performance, give a feeling of tension and keep the listener on edge. Most people are to some extent familiar with how a string instrument is played, and are able to recognise the tension of the strings being forced by the bow. These techniques produce even more partials and a harsher timbre, as we hear the instrument being pushed. The harsh, noisy sounds of metal against metal from the aforementioned razor blades, together with the distortion processing, add to this even more, giving the sense of a system that is working beyond its limits. Tremolo is a well-known performative way of adding tension, and is also a way of adding noise and harsh harmonics to a string instrument.

Here, both the performance of the instruments and the processing work together to achieve the desired noisy, inharmonic timbre. Each layer is carefully sculpted to provide one part of the sound, and in the end the layers come together as one artificial instrument, where beating frequencies, noises, and combinations of timbre create a complex unpleasantness, contributing to the tension and “fear” that is tied to the characters in the story.


Designing Emotional Sounds

Sound designers have a lot more tools at hand than merely choosing a physical instrument. We can sculpt and shape a sound using equalisers, boosting or reducing certain parts of the timbre. We can add more harmonics to a sound with different types of distortion, and add space and movement with reverb, as well as delays and other effects, such as chorus, phasers or moving filters. Apart from this, we have a whole range of different synthesis techniques to generate all sorts of interesting timbres that aren’t available with physical instruments.
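The claim that distortion adds harmonics can be checked numerically. The sketch below is an illustration only: it pushes one cycle of a pure sine through a tanh soft clipper, which stands in for a generic distortion effect and not for any particular product, and measures how strong individual harmonics are before and after. The drive amount of 4.0 and the function name are arbitrary choices for this example.

```python
import math


def harmonic_level(samples, harmonic):
    """Magnitude of one harmonic, for a signal holding exactly one full cycle."""
    n = len(samples)
    re = sum(s * math.cos(2 * math.pi * harmonic * i / n)
             for i, s in enumerate(samples))
    im = sum(s * math.sin(2 * math.pi * harmonic * i / n)
             for i, s in enumerate(samples))
    return 2 * math.hypot(re, im) / n


N = 1024
# One cycle of a pure tone: only the fundamental is present.
sine = [math.sin(2 * math.pi * i / N) for i in range(N)]
# Soft clipping: tanh squashes the peaks, which creates new partials.
driven = [math.tanh(4.0 * s) for s in sine]

for h in (1, 2, 3, 5):
    print(h, round(harmonic_level(sine, h), 3), round(harmonic_level(driven, h), 3))
```

Because tanh treats positive and negative peaks symmetrically, this particular distortion adds odd harmonics only. An asymmetric curve would add even harmonics as well, so the choice of distortion shapes the character of the added partials.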

When looking for an overall direction or aesthetic for a sound, thinking in timbral qualities and letting them guide us in shaping instruments and synthesisers can help find the emotion we want. Say we want to create a sad instrument: we can go towards the hollow, shrill and somewhat noisy type of cello. If we regard the whole sound as the instrument or object, we can also consider parameters such as what room it is playing in, and add other effects too. If we add a longer reverb to a cello, it could support a sense of feeling small in a big space.

If we have two cellos playing at the same time, slightly detuned and with extra harmonics added by distortion, the harmonics from beating and distortion might contribute to a more eerie and uncertain feeling, as heard in the Zimmer example earlier. At the same time, loudness and harmonics could make the difference between an uneasy and an aggressive feeling, similar to how a person would express those feelings through their voice, as mentioned earlier. In the end, it is not about perfectly recreating the sound of a cello. However, since we know it works as an instrumentation apt for sad melodies, we can be inspired by its timbral qualities.
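As a rough sketch of the beating between two slightly detuned instruments: two tones at nearby frequencies beat at the difference between those frequencies, and each pair of matching harmonics beats proportionally faster. The helper names below are hypothetical, and the 10-cent detune on a low cello C of about 65.41 Hz is only an example value.

```python
def detune(freq_hz, cents):
    """Shift a frequency by a number of cents (100 cents = 1 semitone)."""
    return freq_hz * 2 ** (cents / 1200)


def beat_rate(f1_hz, f2_hz, harmonic=1):
    """Beats per second between the same harmonic of two tones."""
    return abs(f1_hz - f2_hz) * harmonic


cello_a = 65.41                  # approximately the cello's low C
cello_b = detune(cello_a, 10)    # second cello, 10 cents sharp (assumed)

# Higher harmonics beat faster than the fundamental.
for h in (1, 2, 4, 8):
    print(h, round(beat_rate(cello_a, cello_b, h), 2))
```

Since distortion strengthens the upper harmonics, it also makes these faster beats more audible, which is one plausible reason the combination sounds restless rather than merely out of tune.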

It is also important not to think of sound design only as designing a single instrument or collection of sounds, but to look at how the different roles play together. A small change in every instrument can, when they are all brought together, go a long way towards shaping the overall sound. The interplay between timbres is important, together with how each individual timbre affects emotions. This might be thought of as mixing in a more traditional sense, but in the end it is really sculpting sounds or timbre on a macro level, rather than a micro level.


It helps to think of an instrument as everything from multiple sound sources in different layers to various types of effects, all playing together to create a whole, rather than considering just the basic instrument alone. This broadens the timbral palette and helps in designing instruments and sounds with emotional qualities. Since there is no blueprint for how timbre affects emotions, finding what to add to or subtract from a sound can become a process of trial and error, relying on personal aesthetic judgment in addition to the various conventions for evoking emotions. Keeping all these things in mind is a good step towards advancing our thinking about how to evoke emotions through sound, and it can support innovation in other musical features, such as harmony and performance.