As a music theorist, and a new member of the Melodrive team, I’m fascinated by how music in video games and VR has evolved and continues to evolve today. This matters to me because it concerns how music has changed in its organisation and sophistication over the years. It is clear, even to a casual gamer like myself, that the aesthetic significance and evolution of video game music are critical concerns for the history of music, and demand serious attention. This is particularly so since, from an academic perspective at least, the genre is a little more niche than it deserves to be.

A Brief History of Music in Video Games

Video game music has humble origins, much like the games themselves. Even the idea of early video game music (c. 1970s and 1980s) brings with it associations of the crass atmosphere of the arcade, where it mixes with the music and sounds of Camptown Races, slot machines, lucky dips, and other more mechanical forms of entertainment (like air hockey). Any child of the 1980s, like myself, who played in arcades and with the fledgling personal computers and consoles that appeared on the scene at the time understands this connection (I hope it’s not just me!).

The connection between video game music and the arcade itself might come down to the evolving computer chip, which was used in both the video game and general entertainment industries, as well as to the shared workspace and customers of the arcade. A crossover of content was inevitable from these beginnings.

The chiptunes of early video game music are now nostalgically revered, but just think about some of the limitations along with the successes. I remember my experience with the Acorn Electron (‘The Elk’). Killer Gorilla (a clone of Nintendo’s Donkey Kong), which used the ‘fate’ motif from Beethoven’s Fifth Symphony at the start of the game, is filled with gambling-style jingles and other sound effects when players start, die, or get to the end of each stage. To me, the cloth-eared alterations to Beethoven’s theme are so severe that it defies listening, let alone enjoyment. (Yet in some way I still like it!)

Let’s take another example. Croaker, which was also released on The Elk, is a clone of Frogger by Konami (although Frogger contains more sophisticated, adaptive music). Croaker uses Für Elise (again by Beethoven) for the opening screen of the game, with tiny changes of timing and touch in the theme, but not in a good way. In addition, there are wrong notes and ‘funny’ touches. Oh, the humanity!

Apart from the fact that this music for The Elk is hopelessly transcribed by the programmer, many 1980s video game soundtracks come across as really dated and naff because of the awkward use of royalty-free music, cardboard-cutout jingles, and bad sound chips. All this leaves the player aesthetically disengaged from the action. Simply put, the standard outcome is a child who is irritable and frustrated, but does not understand why.

However, something was on the horizon. In the mid to late 80s (depending on where you lived), the release of Super Mario Bros for the NES represented a new dawn. With its harmonies and jazz sophistication (so good you didn’t even realise it was jazz), sectional contrasts, and five-channel sound, Mario suited an industry that was appealing to a wider audience. Even if the world had ended in 1987, Glenn McDonald’s appraisal in GameSpot would still be apt for Mario: Video game music has come a long way, baby.

Mario’s Overworld Theme has an unfailing optimism, and for the first time, aesthetics were becoming a high priority. The aesthetic depth is expanded further through the various ‘movements’ of each level. Deep underground, reached via a pipe, the dungeon of the second level uses growling, dark and mysterious minor-keyed motifs that heighten the gamer’s sense of doom.

The water levels also deserve a mention. The ‘waltz’ is so reminiscent of a Straussian/Viennese waltz (although not technically a Viennese waltz) that you can picture (if you try hard) a New Year’s Concert performance in Vienna, with the bourgeoisie clapping enthusiastically for Mario as he dodges the jellyfish. I don’t know where the connection between swimming and waltzes came from, or why they should be so suitably paired, but they definitely are, I assure you. Perhaps it’s the light but soft pulse of the waltz that makes it a perfect accompaniment for swimming. It captures the mood of submerged Mario. This is about as immersive as music could get in the 1980s. I know, I was there. And I remember thinking how submersive it was too!

Broadly, the whole architectural plan of the music, distinguishing each piece through its association with distinct environments, is a masterstroke, and works amazingly well. The music, if not everything else in the game, was a great breakthrough for video games. It showed that music can be paramount for the player’s aesthetic experience, and it raised the bar on what video game entertainment could be, not just for the time, but for all time.

VR, Adaptive Music and Storytelling

Perhaps the next most notable innovation in terms of immersion was the introduction of Virtual Reality (VR) in the late 90s. This enabled a fuller immersion in the gaming experience, allowing virtual environments to be connected more or less directly to the senses: the screen is literally clamped to your face. However, throughout this period, music in video games underwent a longer and quieter revolution through the gradual development of adaptive music. Adaptive music is music that changes with the ebb and flow of the developing environment and drama in a game. As Jennifer Smith writes, adaptive music in games aims to induce emotions through character-player connection. Such music is scripted to change according to conditions in the game, such as the use of a full-blown opera singer to heighten the narrative in NieR: Automata.

Procedural sound and adaptive music are now hot topics in industry and academia. We recently went to a meet-up on storytelling in VR. There, Joel Douek, the co-founder of Ecco VR, gave a particularly interesting presentation, examining the use of sound and music for storytelling.

One of the things Joel pointed out was the important role of sound in VR and its value for producing immersive music. His main points were that the sound needs to match the visual space, and that sound and music directly affect the gamer’s emotions.

Sound and music channel the internal and external experiences, and help to enhance the ‘reality’ of the space. A sense of space within the music can be achieved through both composition and production, so that a player is given an immediate sense of where they are within their surrounding environment. For example, hearing a human heartbeat through a multi-speaker arrangement, with the sound adapting to the game itself, gives the player a heightened sense of where they are in the game. These aspects are paramount for composers who need players to be immersed in the story of the game.

Deep Adaptive Music and its Benefits for Immersion

We’re currently on the verge of another video game music revolution, with what we at Melodrive call deep adaptive music (DAM). Now, and in the future, the video games industry is in a position to go back to synthesised sound, creating interesting music that is built from the bottom up. Such music would be sensitive to the virtual environment, like adaptive music, but also aware of the emotional desires and disposition of the player, and it would be generated in real time rather than simply relying on pre-recorded audio. All this can be done using the highest sound quality: no more cheap MIDI sounds! This is the future of video game music.

The five key features of DAM are as follows:

  1. DAM adapts on a granular level.
  2. DAM is infinite, depending on the unique path of the user.
  3. DAM can smoothly transition between emotional states.
  4. DAM is tailor-made to the user.
  5. DAM is generated in real time by cutting-edge AI.

DAM provides an open synthesis between the free development of the game and the emotional journey and disposition of the user. For many of us who are suspicious of our five-sense realities, DAM might be a really attractive blue pill with which we can anchor ourselves. And I’m not just talking about teenage nerds. To bring up Granny, as the late philosopher Jerry Fodor was notorious for doing (Granny believes that the mind is real), even Granny might approve of this technology because it doesn’t feel like ‘technology’ and it doesn’t feel like ‘AI’. It would feel natural, as all good technology should. So Granny can go on believing what she likes, but not in normal reality.

So might this really be a new path in video game music evolution? This new technology must fit hand in glove with current games platforms yet be so much more, allowing an infinite music world within current virtual worlds. Suffice it to say, we must have evolution, not revolution. To reverse Neil Armstrong, DAM is a giant leap from linear music, but a more moderate step from adaptive music. DAM is important because it allows greater immersion and thus engagement for the player, matching the needs of current interactive arenas and game creation platforms. Thus, it will be a step forward that will be embraced by current gamers and VR users.

At Melodrive, we believe that DAM is the future. We’re developing a system that can achieve this, and we do consumer research to understand it better. To look at the possible benefits of immersion, we’ve done our best to get inside the heads of gamers and VR users and find out how music is perceived and appreciated in games and VR experiences. Many of our surveys, whether focussing on general gamers or VR users, have found that most people value music as an important aspect of the video game experience, and have strong intuitions about how music should be used in various settings. Many gamers and VR users complain that the current state of the art of music in video games is poor, and doesn’t meet their needs. They’ve said they’d like to see more flexibility for players to interact with music in emergent ways. We aim to solve this with a fully adaptive music solution that fits all purposes.


Trevor Rawbone is a music theorist with a PhD in music theory from Huddersfield University, UK, focussing on music schemata and grammars. He is interested in aspects of music cognition and music representations, and has published and presented his work in various academic fora. He has recently joined the Melodrive team to develop music models and representations for use in deeply adaptive music AI.