Not more than a Herculean stone's throw away from the wonderful and iconic St Thomas Church, where Bach composed tons of sleek cantatas, lies the stately Hochschule für Musik und Theater, the regal setting of the video game musicology conference, Ludo2018. This is one of the field's foremost annual meets, and it took place last weekend (13-15 April) in Leipzig. The Ludo conference was established by the first (and in my opinion, the best) research group in ludomusicology, the Ludomusicology Research Group, whose Venerable Elders, Michiel Kamp, Tim Summers, Melanie Fritsch, and Mark Sweeney, gather scholars in wonderful locations in the UK and Europe to study what anyone in their right mind would love – music in video games, its history and its impact. Leipzig is a place of great historical, scientific, and artistic significance. Let's see: Bach, Mendelssohn, Wagner, Goethe, Leibniz, etc., all have strong associations with this place (and 'etc.' means a lot in this context).

Bach’s St. Thomas Church

The Use of Leitmotifs

The Wagnerian connection may be particularly apt for ludomusicology owing to the wide use of leitmotifs in classic and modern video game music. Leitmotifs are musical themes associated with characters, places, or ideas, most famously developed by Wagner, the composer who pioneered the epic music drama. Leitmotif was a subtle but consistent theme of Ludo2018; I was listening out for it, as it were. The idea of leitmotif underpins many of the fashionable concepts in the field, such as adaptation, embodiment, and immersion, all of which concern the connection between music and the internal and external gaming experience.

Traditionally, leitmotifs have been used to create engagement, such as in games like The Legend of Zelda, but they are also intrinsic to current adaptive and interactive video game music (just check out NieR: Automata, 2017). If you listen to the main theme of Zelda for the NES, the heroic melody becomes associated with the main character, Link, providing an emotional signature that can be recalled later in the game. When used again it can be varied, changing its meaning in the process. Thus leitmotifs are hugely important for creating congruence between the game environment and the music, a concept that I'll talk a lot about in due course. Check out the original version of the Link theme and the following fairly competent leitmotif analysis:

And its analysis:
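
If you prefer to see the mechanic in code rather than in staff notation, here is a toy sketch, in Python, of the basic idea: a motif is registered against a character and recalled later in a varied form. The Leitmotif class, the note numbers, and the transposition trick are all my own invented placeholders for illustration; no Zelda title implements its score this way.

```python
# A toy leitmotif registry: themes are stored per character/idea and can be
# recalled later in a varied form (here, simple transposition).
# Everything below is illustrative and hypothetical.

NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

class Leitmotif:
    def __init__(self, name, pitches):
        self.name = name          # e.g. "Link"
        self.pitches = pitches    # MIDI note numbers

    def transposed(self, semitones):
        """Return a varied statement of the motif: same contour, new pitch level."""
        return Leitmotif(self.name, [p + semitones for p in self.pitches])

    def __repr__(self):
        spelled = " ".join(NOTE_NAMES[p % 12] + str(p // 12 - 1) for p in self.pitches)
        return f"<{self.name}: {spelled}>"

# Register a (made-up) heroic motif for Link, then recall it later, varied.
registry = {"Link": Leitmotif("Link", [60, 64, 67, 72])}   # C4 E4 G4 C5

print(registry["Link"])                  # first statement establishes the association
print(registry["Link"].transposed(-3))   # later recall: same shape, new pitch level
```

The variation step is the important bit: the recalled motif is still recognisably 'Link', but its new pitch level can shade the meaning of the scene it accompanies.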

Dynamic, Adaptive and Interactive Music

Before proceeding, I should define some key ludomusicological terms. Tim Summers (one of the Venerable Elders of ludomusicology, who presided over Ludo2018 with Melanie Fritsch) wrote 'the book' on video game music, Understanding Video Game Music (2016). In it he outlines the main research traditions, goals, and prospects of ludomusicology, identifying some of its key terms (referencing Karen Collins' glossary in Game Sound, 2008):

  • Dynamic music is a generic term for changeable music.
  • Adaptive music changes in reaction to the game state, but not as a direct result of the player's actions in the game.
  • Interactive music changes in direct response to the player's actions.
  • Procedural and generative music are forms of interactive music generated in real time by algorithms that recombine musical material to produce new scores on the fly.
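
Since the adaptive/interactive distinction is the one that most often trips people up, here is a minimal, purely hypothetical Python sketch of how the two might be wired up differently: one controller only ever looks at the game state, the other only ever reacts to direct player input. The class names, events, and thresholds are mine, invented for illustration; they are not drawn from Collins, Summers, or any real audio middleware.

```python
# Toy illustration of the glossary above: adaptive music reacts to game state,
# interactive music reacts to direct player input. All names are invented.

class AdaptiveMusic:
    """Changes with the game state, not directly with what the player does."""
    def update(self, game_state):
        if game_state["time_of_day"] == "night":
            return {"track": "ambient_night", "tempo": 70}
        if game_state["enemies_nearby"] > 3:
            return {"track": "tension_loop", "tempo": 120}
        return {"track": "overworld_theme", "tempo": 100}

class InteractiveMusic:
    """Responds directly to discrete player actions."""
    STINGERS = {
        "sword_swing": "combat_stinger",
        "item_pickup": "reward_chime",
        "jump": "whoosh_riser",
    }
    def on_player_action(self, action):
        return self.STINGERS.get(action)  # None if the action has no musical response

adaptive = AdaptiveMusic()
interactive = InteractiveMusic()
print(adaptive.update({"time_of_day": "night", "enemies_nearby": 0}))  # state-driven
print(interactive.on_player_action("sword_swing"))                     # input-driven
```

The point of the toy is simply that the two systems subscribe to different things, which is exactly the distinction the glossary is drawing.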

As might have been expected, Ludo2018 covered these hot topics, as well as others, such as integration, VR, entrainment, and immersion, which I will discuss shortly. One such hot topic at the conference was adaptive transitions in video game music, which permit greater immersion in the internal and external gaming experience. Andy Elmsley, one of the founders of Melodrive and a developer of its AI, described immersion in an earlier blog post as when "your senses of presence and attention shift into the virtual world, and the more you believe the world you're in, the more your brain convinces you that what you're seeing is real". Tim Summers has put forward a comparable definition of immersion in the aforementioned bible of ludomusicology, Understanding Video Game Music (2016, p. 58):

As players, we want games to be more than interaction. We long for those times when a game is so absorbing, we lose track of time and our surroundings — we prize games that give us that rewarding sense of deep involvement in a fictional construct and gameplay mechanism.

In her talk, "The Blurring of Worlds: The Soundscape(s) of NieR: Automata", Jennifer Smith discussed the adaptive transitions of the soundtrack, noting how lucid soundscapes alter smoothly depending on location, before examining how the transitions between in-game environments lead to greater immersion (see also her blog post). Jennifer found that the composer builds the soundscape with an eye to the player's progress within the narrative of the game, which enables smoother transitions and thus greater immersion.

Adaptive Music in Google Earth VR

Michiel Kamp (one of the Venerable Elders), in his talk on dynamic music, explored how music changes according to various environmental contexts in Google Earth VR. Keeping up with the times, the folks at Google have designed music that adapts to various environmental contexts, as well as to the user's proximity to the virtual Earth. The composer, Joshua Moshier, was commissioned to create some pretty subtle ambient music that could in no way, under any circumstances, be considered invasive. Heaven forbid. This is all in the service of producing more interactive and immersive VR experiences; a big improvement on the 2D, behind-glass, keep-your-mouth-shut approach of yesteryear, you might say. Michiel suggests that this "ecological" way of listening is absolutely key for VR, with affordances remediated by film and video game music, acknowledging the crossover between media that characterises the modern video game revolution.
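
Michiel's talk was about design rather than code, but to make the general idea concrete, here is a rough, hypothetical sketch of altitude-driven ambient mixing: as the user descends towards the virtual Earth, one layer fades out and another fades in. The layer names, altitude bands, and crossfade curve are entirely my own assumptions, not anything taken from Google Earth VR or Joshua Moshier's score.

```python
# Hypothetical altitude-driven ambient mix: a "space drone" layer fades out and
# a "ground ambience" layer fades in as the viewer approaches the virtual Earth.
# Purely illustrative; not Google Earth VR's actual system.

def layer_gains(altitude_m, low=1_000.0, high=100_000.0):
    """Return per-layer gains (0.0-1.0) as a function of viewer altitude."""
    # Normalise altitude into [0, 1] between two invented thresholds.
    t = (altitude_m - low) / (high - low)
    t = max(0.0, min(1.0, t))
    return {
        "space_drone": t,               # louder the further out you are
        "ground_ambience": 1.0 - t,     # louder the closer you get
        "birdsong": max(0.0, 0.3 - t),  # only audible very near the ground
    }

for altitude in (200, 5_000, 50_000, 150_000):
    print(altitude, layer_gains(altitude))
```

Crude as it is, this is the sort of 'ecological' coupling between position and sound that Michiel was pointing to.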

However. Shock-horror. Not everyone is so impressed with the musical and visual combo in Google Earth VR. Cale Hunt, in his article for vrheads.com, expresses some disappointment about the coherence between its visuals and soundscape:

Google Earth VR does come with music and some sound, but it doesn’t always apply to what you’re seeing. For example, you’ll often hear birds chirping and calm music playing while you’re standing in the middle of downtown Manhattan. One of the best things to do for extended periods of play is set up your own themed music.

While Cale acknowledges that Google Earth VR allows users to import their own music from YouTube, his assessment echoes our intuitions, and the research we have conducted at Melodrive: gamers and VR users are often unhappy with game and VR music, preferring to bring their own music to the party. Either they don't like the game soundtrack or they don't think it's appropriate. We've heard this complaint so much at Melodrive that whenever it comes up at the office we sing in chorus: "Gamers want personalised music that fits the game!" And then we fall about laughing, because everyone knows it, but nobody does it. Nevertheless, I think we'd all agree that Google Earth VR goes some way towards providing 'appropriate' music, but perhaps doesn't go far enough. Let's hold our horses for a second. We at Melodrive are not here to critique the music of Google Earth VR, because in any case it has a rather specific function that is different from our own goals. What I'm trying to say is that while it might provide a somewhat dynamic solution, it is not truly adaptive, let alone deeply adaptive, which are the things we prefer to give top billing at Melodrive.

I gave a presentation at Ludo2018 exhibiting the Melodrive engine in all its glory. At Melodrive, we are developing a system that creates a new type of procedural/interactive music, which we term deep adaptive music (DAM). Our system builds on current systems, creating more fully adaptive experiences in gaming and VR, with richer soundscapes and deeper entrainment and immersion. Many adaptive and procedural game scores have been developed over the last decade, for titles such as Spore (2008), Red Dead Redemption (2010), and NieR: Automata (2017), which, while brilliant, still rely to varying degrees on looped material and pre-prepared musical building blocks. As a result, procedural and generative music can sound, if not bland, then sometimes undirected and without the magic that we quietly demand from music. As humans, we want music to sound fresh, having that 'just improvised' feel. Yet it must sound purposeful and intelligently designed (if you'll forgive the expression), and have the swish of a brand new car; like you've just stepped into a saloon, as people used to say in the 80s.

A perennial problem for adaptive/procedural music is that it is difficult to implement and is often not suited to musical strategies that use highly structured and organised melodic and accompaniment material. The simple reason is that you can't adapt musically when you're in the middle of a melody, because a melody is holistic and self-contained. Melodrive's DAM, however, develops music at a granular level, with a practically infinite number of combinations that can cohere with the infinite player paths possible in a game. We can create superhuman music because we are not hamstrung by melody, harmony, and metre in the way other adaptive systems are; on the contrary, the Melodrive engine embraces them. We have worked out ways that these concepts can be changed subtly and congruently on the fly, in accordance with the fine-grained internal and external emotional contexts of a game.
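
To give a flavour of what adapting at a granular level means in practice (and emphatically not as a description of Melodrive's actual engine, which I won't be reverse-engineering in a blog post), here is a toy Python sketch in which each new note is chosen one at a time against a valence/arousal target. Because decisions are made note by note, the emotional context can change mid-phrase without the music breaking. The scales, the leap rule, and the numbers are all invented for illustration.

```python
# Toy note-by-note ("granular") generator steered by a valence/arousal target.
# Not Melodrive's engine; just an illustration of adapting mid-melody.
import random

MAJOR = [0, 2, 4, 5, 7, 9, 11]   # scale degrees used for positive valence
MINOR = [0, 2, 3, 5, 7, 8, 10]   # scale degrees used for negative valence

def next_note(prev_pitch, valence, arousal, tonic=60):
    """Choose one note at a time, so the emotional target can change at any point."""
    scale = MAJOR if valence >= 0 else MINOR
    max_leap = 1 + int(round(abs(arousal) * 4))   # higher arousal allows bigger leaps
    candidates = [tonic + octave * 12 + degree
                  for octave in (-1, 0, 1) for degree in scale]
    near = [p for p in candidates if abs(p - prev_pitch) <= max_leap] or candidates
    return random.choice(near)

# The emotional context flips halfway through; the melody adapts immediately.
random.seed(0)
pitch, melody = 60, []
for step in range(16):
    valence, arousal = (0.8, 0.2) if step < 8 else (-0.7, 0.9)
    pitch = next_note(pitch, valence, arousal)
    melody.append(pitch)
print(melody)   # MIDI note numbers: stepwise major at first, then leapier minor
```

The contrast with loop-based systems is the unit of decision: a loop commits you for bars at a time, whereas a note-level decision can bend to the game state almost immediately.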

On Musical Congruence

Richard Stevens, of Leeds Beckett University, gave an enviably slick talk on indeterminacy and congruence in video game music, pointing out just this problem of matching adaptive/procedural music to the ongoing action in games. He discussed how relative congruence between music and internal and external events can be achieved. Examining the notion of congruence between these elements in illuminating detail, he noted that indeterminacy is required in music when there is an aesthetic preference for smooth transitions. At these points game music should be ambiguous within a parameter, such as harmony (harmonic ambiguity), metre (metrical ambiguity), or theme (short 'motifs', i.e., small melodic units), because ambiguity in these parameters avoids determined 'goals'.

Richard's theoretical explication of congruence drew on a number of eminent sources, such as Annabel Cohen's Congruence-Association Model (Cohen 2015), which theorises the congruent relations between film and music. This is a broadly similar idea to the concept behind the Melodrive engine (which I'll explain in not-too-much detail shortly). Her hypothesis is that the most congruent elements in the film and its music are attended to by cognition, and thus become associatively bound (Cohen 2015, p. 10). This rings true. We only have to watch the movie Jaws to see that a prerequisite for the expectation and suspense is the associative connection between the big musical theme and the big Great White. Even when the shark is not in the shot, the leitmotif fills in the blank, and that blank is big. In short, the elements become psychologically associated as a result of congruence.

With our understanding of Cohen's theory of congruence we can see further into the issues surrounding adaptive music. Richard said that he generally advises students that, in order to write music with smooth transitions, there must be congruence between visuals and music. Good advice, you might say. He therefore argues that there should be no 'vector' parameters (i.e., goal-oriented features) such as clearly defined harmonic units, metres, or themes, which he thinks would make transitions abrupt. As a solution, Richard points towards various techniques that are both vectorised and temporally ambiguous, such as the Shepard-Risset glissando, used in games like Wolfenstein: The New Order (2014) and Doom (2016). Here is the soundtrack of Doom (2016). Worth listening to in full if you have the time:
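
And for anyone who wants to hear what a Shepard-Risset glissando is actually doing, here is a small sketch that synthesises one: octave-spaced sine partials slide downwards and wrap around, while a bell-shaped loudness envelope hides the joins, giving the illusion of an endless descent with no arrival point. It uses NumPy and the standard-library wave module; the parameter choices are mine and are not taken from any game's soundtrack.

```python
# Minimal Shepard-Risset glissando: the endlessly "descending" pitch illusion.
# Octave-spaced partials glide down and wrap around; a raised-cosine loudness
# envelope keeps their entries and exits inaudible. Parameters are illustrative.
import numpy as np
import wave

SR = 44100                 # sample rate (Hz)
DUR = 10.0                 # total duration (seconds)
CYCLE = 5.0                # seconds for each partial to fall one octave
N_PARTIALS = 8             # number of simultaneous octave-spaced partials

t = np.linspace(0, DUR, int(SR * DUR), endpoint=False)
signal = np.zeros_like(t)

for k in range(N_PARTIALS):
    # Position in octaves above 55 Hz; decreases over time and wraps around.
    octave_pos = (k - t / CYCLE) % N_PARTIALS
    freq = 55.0 * 2.0 ** octave_pos                  # instantaneous frequency
    phase = 2 * np.pi * np.cumsum(freq) / SR         # integrate frequency -> phase
    # Loudness peaks mid-register and falls to zero at the extremes,
    # so each partial fades out before it jumps back to the top.
    amp = 0.5 * (1 - np.cos(2 * np.pi * octave_pos / N_PARTIALS))
    signal += amp * np.sin(phase)

signal /= np.max(np.abs(signal))                     # normalise to [-1, 1]
pcm = (signal * 32767).astype(np.int16)

with wave.open("shepard_risset.wav", "wb") as f:
    f.setnchannels(1)
    f.setsampwidth(2)       # 16-bit samples
    f.setframerate(SR)
    f.writeframes(pcm.tobytes())
print("Wrote shepard_risset.wav")
```

The trick is exactly the kind of vectorised-but-temporally-ambiguous device Richard described: there is a constant sense of motion, yet no harmonic or metrical goal is ever reached.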

Mick Gordon, the composer of Doom, has discussed in an interview the difficulty of developing expanses of pre-planned music that can fit the outcomes and possibilities of a game, to enable smooth transitions and deeper immersion:

It’s crazy really, I have these really massive sessions of music which can be around 20 minutes or 40 minutes long. What’s in there is a whole series of possibilities. An example of one of these will be, “Oh, you’ve just stopped shooting and it’s mid-verse, what does the music do?” Then that happens. Another example will be, “What if you stop shooting and a giant boss jumps out?” That’s its own possibility again. Another could be, “What happens if you’ve gotten up to a boss and you’ve paused the game?” When you start on a project, you basically define these possibilities and then write music around them, it’s always pre-planned.

An issue with Doom (2016) is that while it has a great adaptive soundtrack that exhibits smooth transitions, it doesn't provide truly interactive music (although it doesn't try to), because its recombination of pre-planned musical units depends largely on how the game unfolds, and not so much on the motives of the player. As already noted, this approach is limited in the freshness and immersion it can provide. More broadly, while creating indeterminacy, avoiding vectors, and keeping a catalogue of pre-prepared music may be satisfactory solutions, we at Melodrive feel it is a shame for music to lose such interesting goal-oriented concepts as melody, metrical structure, and harmonic structure, because these overarching structures provide a lot of the interest in music. This is not too much of a problem with Doom, which is not short of a good melody or two. But let's be honest, we wouldn't want every game to be like Spore (2008), with its sporing approach to musical development that in lesser hands could create ambient goulash.
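
Mick Gordon is describing authored branches rather than code, but the structure he sketches boils down to a lookup of pre-written segments keyed on anticipated game events. Here is a deliberately crude, hypothetical rendering of that idea in Python, which also makes the limitation visible: every branch has to be anticipated and authored in advance, and nothing is generated. The event and segment names are invented and model no real game's middleware.

```python
# A crude sketch of the "pre-planned possibilities" approach described above:
# every (current segment, game event) pair maps to a segment authored in advance.
# Event and segment names are invented for illustration only.

TRANSITIONS = {
    ("verse", "stopped_shooting"):   "ambient_tail",
    ("verse", "boss_appears"):       "boss_intro",
    ("verse", "game_paused"):        "pause_drone",
    ("boss_intro", "boss_defeated"): "victory_sting",
}

def next_segment(current_segment, event):
    """Return the pre-authored segment for this (state, event) pair, if any."""
    # If the designers never anticipated this combination, the music simply
    # carries on with what it was playing: the core limitation of the approach.
    return TRANSITIONS.get((current_segment, event), current_segment)

print(next_segment("verse", "boss_appears"))      # anticipated   -> "boss_intro"
print(next_segment("verse", "player_moonwalks"))  # unanticipated -> stays on "verse"
```

However lovingly the segments are composed, the musical response can only ever be as rich as the table of possibilities someone wrote down beforehand.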

Richard has raised an important challenge that we think Melodrive could solve in spades: generating interesting musical concepts (such as melody and metre) which are fundamentally goal-directed vectors, but which can withstand transitions between various internal and external game environments. This issue of congruence between music and game environments, which is a nightmare-concern for ludomusicologists, is now looking smaller and smaller, since, as you might have guessed by now, congruence between visual and musical elements has been all but solved through DAM. Through DAM, we maintain congruence between the internal and external environments and regular musical concepts, like melody and harmony, while allowing granular flexibility.

Cohen's (2015) notion of congruence is in line with my own published work (Rawbone and Jan 2015; Rawbone 2017), where I've developed the theory of the Tendency for Congruence for pure musical structure. Broadly, this is the idea that there is a propensity for cognition to generate structures in which the musical features run towards congruence. This might occur for various reasons, but perhaps the most profitable explanation is that congruence is actually necessary for meaning in music. I argue that if musical elements weren't congruent with each other, we wouldn't be able to understand music at all. That is, if the elements of music were totally noncongruent, no part would, by definition, be relatable to the next, and so it would be impossible for such music to be meaningful. If this story about musical meaning turns out to be true, it is no wonder that we find congruence between the internal and external environments of a game and its music aesthetically preferable: it is not merely preferable but fundamentally necessary for meaning.

Final Thoughts

In closing, Ludo2018 treated us delegates to a glimpse of the richest seams of video game music scholarship currently being worked by ludomusicologists. The conference has been characterised by James Hannigan (renowned video game composer) as being dedicated to 'providing a meaningful dialogue on the aesthetics of games music between industry and academia'. (He mentions this conference specifically, but his sentiments extend to other avenues of the field.) James goes on to say that the emergence of ludomusicology has reaffirmed something he'd always thought to be true, 'that video games music can be unique and is eminently worthy of study and analysis'. No-one with any knowledge of classic video game music and the current-day wonders of interactive music can, of course, deny its importance. A bonus point for this conference is that it will increase the academic and popular cultural standing of video game music across the globe. So maybe one day we will mention Koji Kondo in the same breath as Paul McCartney or Frédéric Chopin. And without it even sounding a little forced. There are probably already undergraduate courses in video game music, but the acid test for knowing when it has 'arrived' is when it is taught at school, where everyone gets a piece of the action. When that happens, the Venerable Elders can sit back and puff on a cigar knowing they did their bit to promote art.