When it comes to virtual reality (VR) experiences, “immersion” and “immersion multipliers” are highly valued by players and developers alike, but what do they really mean? What exactly is immersion and why is it seen as the holy grail for VR? In this post we try to get to the bottom of what makes someone feel immersed, and provide some quick strategies you can use to make your VR experiences a lot more immersive – by thinking only about audio.

(Cover image: Shutterstock)

What is Immersion?

Immersion is a state of mind. It’s what happens when you lose the sense of your physical self in an artificial simulation, experience, or fiction, so the simulation appears real and you momentarily forget about the ‘real’ world. It doesn’t only happen in VR; it can also happen when reading a book or watching a film, or even (for me) when performing a difficult cognitive task, like writing some code. To put it another way, in a state of immersion your senses of presence and attention shift into the virtual world, and the more you believe the world you’re in, the more your brain convinces you that what you’re seeing is real.

When it comes down to it, it’s hard to define exactly what causes immersion. If you’re scientifically inclined like me, you will look to the existing body of research into the area. There’s quite a lot of research out there, but no real consensus as to the components that lead to an immersive experience. And when the academics can’t agree you know you’re in trouble.

Staffan Björk and Jussi Holopainen’s four categories of immersion, as defined in their excellent book Patterns In Game Design, seem to me like a good way to start looking at immersion. They’re not exhaustive, but they’re a very practical way to think about immersion specifically in games and VR.

Björk and Holopainen’s categories are:

Sensory-motoric immersion
This occurs when you perform actions with your hands or limbs and get feedback through all of your senses – sight, sound and touch – and even smell and taste!

Spatial immersion
When the simulated world is perceptually convincing, you are said to feel spatially immersed, which directly affects your sense of presence. There are subtle nuances to this including the scale and environmental cues of the world (such as sound!).

Cognitive immersion
This occurs when you’re focused on a specific task that requires mental exercise – it’s perhaps the most common form of immersion as most people experience it when learning something new.

Emotional immersion
This is another common immersive state, often brought about when you watch a film or read an engrossing book. It occurs when players become emotionally invested in the experience.

Why is Immersion so Important for VR?

A lot of people think that the simple act of having some content in VR will give you a magic boost in “immersion”, but this is only true to a certain extent. While it’s definitely easier to achieve a state of immersion in VR, immersion is actually a fickle beast that’s difficult to maintain for a longer period of time.

During a VR experience your senses are being bombarded with a lot of conflicting information. In most experiences, players will have most of their senses covered: their eyes completely covered, their ears in some kind of headphones, and their hands holding one or more controllers. Some companies add more to this setup, but this is the typical VR rig.

Even with this default sensory-motoric immersion, if something happens in the ‘real’ world, or the simulated experience conflicts with your senses, then the level of immersion can come crashing down – often with nasty side-effects such as disorientation, confusion and motion sickness. That’s definitely not a good thing!

As VR developers, we can’t fully control the external rigs and environment our players use, so instead we have to use in-experience tricks to coax the user into a deeper sense of immersion. The more a user is immersed, the more engaged they will feel in the experience, and the less likely they are to feel this cognitive dissonance between the real and simulated worlds.

Ultimately, though, if the experience is deeply immersive, your players will feel happier and find themselves wanting to go back to the experience time and time again, for that “just a bit more” feeling. This is a key to the widespread adoption of VR.

How to Increase User Immersion with (Mostly) Audio

Since I’m a bit of an audiophile, I’m going to focus the rest of this post on some ways to increase the feeling of immersion with only audio techniques. I often feel that audio doesn’t get the attention it deserves in general, probably because, like a good film soundtrack, if the audio is working well it shouldn’t be noticeable.

Well-implemented audio seamlessly blends into the experience and feels like the real thing, but when this goes wrong it causes strange things to happen in your brain that most people won’t be able to put their fingers on – they just know it feels “wrong”. Luckily there are a few simple things you can implement to avoid this!

Use Spatial Audio for Diegetic Sounds…

Diegetic sound is a term from film – it basically means the sounds made by objects, characters etc. that exist within the world of the story. When you hear sound in the real world, your brain is doing a lot of complicated stuff to figure out where the sound is coming from and how far away it is. When this ability is taken away, the effect is extremely disorienting. Most people won’t be able to figure out what’s wrong, but the world will just feel odd and fake. This problem boils down to a lack of spatial immersion, but by using spatial audio techniques you can eliminate this for your players!

Going back to film for a second, lack of spatial immersion was also a problem for film-goers, and this was the reason for the invention of surround sound. Even with a huge screen in a dark room, hearing audio out of two fixed speakers was a sure-fire way to remove yourself from the experience. The inventors of surround sound placed more speakers around the audience, so that sound can actually play behind you if a character is out of shot. The same problem exists in VR, except we can’t just add more speakers to the headset… or can we?!

(Image: Mike Kim)

Seriously, you don’t need a sci-fi helmet! Spatial audio is a tech that’s been around for quite some time in games, and is really coming into its own in VR titles. With spatial audio, sound sources can be placed into the 3D world, and the engine figures out how to manipulate these sounds so they sound like they’re coming from that spot, even through a stereo mix. If you have ambient sound effects in-game, you should definitely spatialise them for their full effect.
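To make the idea a little more concrete, here is a minimal Python sketch of how a positioned sound could be turned into left and right gains for a stereo mix. It is a deliberate simplification: it uses constant-power panning plus inverse-distance attenuation on a flat 2D plane, whereas real engines typically use head-related transfer functions (HRTFs) and can also distinguish front from back. The function name, the `ref_distance` parameter and the coordinate convention are assumptions made for illustration, not any particular engine’s API.

```python
import math

def spatialise(listener_pos, listener_yaw, source_pos, ref_distance=1.0):
    """Return (left_gain, right_gain) for a sound source on a 2D plane.

    listener_yaw is the direction the listener is facing, in radians,
    measured counterclockwise from the +x axis (an assumed convention).
    """
    dx = source_pos[0] - listener_pos[0]
    dy = source_pos[1] - listener_pos[1]
    distance = math.hypot(dx, dy)

    # Angle of the source relative to where the listener is facing;
    # positive means the source is to the listener's left.
    azimuth = math.atan2(dy, dx) - listener_yaw

    # Map the angle to a pan value in [-1, 1]: -1 is fully left, +1 fully right.
    pan = -math.sin(azimuth)

    # Constant-power panning keeps the overall loudness steady across the arc.
    angle = (pan + 1.0) * math.pi / 4.0
    left, right = math.cos(angle), math.sin(angle)

    # Inverse-distance attenuation, clamped so very close sources don't blow up.
    attenuation = ref_distance / max(distance, ref_distance)
    return left * attenuation, right * attenuation
```

Calling `spatialise((0.0, 0.0), 0.0, (0.0, 1.0))` places a source directly to the listener’s left, so almost all of the signal ends up in the left channel; moving the source further away scales both gains down.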

…and Ambisonic Audio for Non-diegetic Sounds

Spatial audio is great for diegetic sounds whose sources are visible on the screen, such as a babbling brook or a gunshot, but when it comes to non-diegetic sounds – like the game’s soundtrack – we run into another issue. It’s not often desirable to have the music coming out of fixed points in the 3D world; instead the player should feel enveloped by the music. However, if the music mix is fixed (e.g. the guitar is always in front of you, no matter where you turn), then this can cause discomfort that leads to a break in sensory-motoric immersion.

To tackle this, we can use ambisonic audio. This is where we have a set of points around the listener in 3D space, but relative to the listener rather than in absolute positions. The 3D points move with the listener, but changes in the mix can be heard when the player turns around or moves their head. This simulates the experience of being in a real-life environment, almost as if you were at a concert standing in the center of an orchestra!
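As a rough illustration of the “mix changes when you turn your head” part, the sketch below encodes a mono sample into first-order ambisonics (the traditional four-channel B-format: W, X, Y, Z) and then counter-rotates the sound field by the listener’s head yaw. The function names and the angle convention (azimuth and yaw measured counterclockwise, with 0 meaning straight ahead) are assumptions for this example; a real pipeline would also handle pitch and roll and then decode the result to headphones.

```python
import math

def encode_first_order(sample, azimuth, elevation=0.0):
    """Encode a mono sample into first-order B-format (W, X, Y, Z)."""
    w = sample / math.sqrt(2.0)                            # omnidirectional part
    x = sample * math.cos(azimuth) * math.cos(elevation)   # front-back axis
    y = sample * math.sin(azimuth) * math.cos(elevation)   # left-right axis
    z = sample * math.sin(elevation)                       # up-down axis
    return w, x, y, z

def rotate_for_head_yaw(bformat, head_yaw):
    """Counter-rotate the sound field when the listener turns their head.

    Turning the head left by head_yaw makes every source appear to move
    right by the same angle, so X and Y are rotated by -head_yaw.
    W (omni) and Z (height) are unaffected by a turn about the vertical axis.
    """
    w, x, y, z = bformat
    c, s = math.cos(head_yaw), math.sin(head_yaw)
    return w, x * c + y * s, y * c - x * s, z
```

For example, a guitar encoded straight ahead (`azimuth=0.0`) ends up on the listener’s right after `rotate_for_head_yaw(..., math.pi / 2)`, which is what you would expect after turning 90 degrees to the left.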

(Image: Amp Audio)

Ease the Learning Curve with UI Audio

One sure-fire way to bring your player out of a VR experience is to give them a complicated control scheme to learn. Most players will struggle with pressing more than one button at once, especially if it’s an unnatural gesture. This is an area where VR actually shoots itself in the foot a little bit. Since players can’t see their hands, it’s even harder to learn the new controls.

The control scheme will of course affect sensory-motoric immersion, but also has a strong effect on cognitive immersion. Somewhat counter-intuitively, hard-to-learn controls do not lead to cognitive immersion but actually hinder it. This is because in order to feel cognitively immersed you also need to feel in control of your own actions and therefore get less frustrated!

But don’t fret, audio is here again to save the day! Having simple audio cues that respond to the user’s actions is a form of positive reinforcement. It aids cognitive immersion as it reduces the frustrations of not knowing if you’re even doing the right thing or pressing the right buttons.

An excellent example of UI audio done well in VR is Funomena’s Luna. With a very simple control scheme coupled with a few effective gong sounds providing audio feedback, the player is eased into the controls and feels more engaged as a result.

(Image: Wired)

Use (Deep) Adaptive Music for your Soundtrack

We all know that music has an amazing power to convey emotions, and you might have noticed that this is used in the real world everywhere you turn. When was the last time you watched an advert on TV with no music, for instance? In VR, though, players have their own agency and behaviours – they can choose where to go and how to act. This means that if we want to increase their emotional immersion, we need to be able to dynamically respond to what they do with the music. This is called adaptive music, and is so fundamental to interactive media that leading video game composer Guy Whitmore says that it’s the only possible way to score it.
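One common, simple way to make music respond to the player is to crossfade between pre-composed layers according to some game-state value. The sketch below assumes a hypothetical `intensity` value between 0 and 1 (say, how much danger the player is in) and two stems, “calm” and “tense”; these names are illustrative only, and this is a generic layering technique rather than any specific product’s method.

```python
import math

def layer_gains(intensity):
    """Equal-power crossfade between a calm stem and a tense stem.

    Returns (calm_gain, tense_gain). intensity is clamped to [0, 1].
    """
    t = min(max(intensity, 0.0), 1.0)
    return math.cos(t * math.pi / 2.0), math.sin(t * math.pi / 2.0)

def smooth(current, target, dt, time_constant=2.0):
    """Ease the current intensity toward the target over time.

    Exponential smoothing avoids abrupt jumps in the mix when the game
    state changes suddenly. dt and time_constant are in seconds.
    """
    alpha = 1.0 - math.exp(-dt / time_constant)
    return current + (target - current) * alpha
```

Each frame you would call `smooth` with the frame time to update the intensity, then apply `layer_gains` to the two stems, so the soundtrack drifts between moods instead of cutting.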

One problem with adaptive music is that it tries to score potentially infinite interactive content with a finite amount of musical content. This essentially creates an “average” soundtrack, which is not tailored to a specific user. This can be hugely detrimental to emotional immersion in VR, as the player needs to feel like they’re really there and so they need their own personalised soundtrack to match.

At Melodrive we’ve developed a new way to score interactive media, called Deep Adaptive Music (DAM). DAM utilises the amazing power of AI to score the events happening in VR on a granular level. This means that the music can seamlessly transition from any emotional state to any other emotional state, in a meaningful and smooth way that is unique to each player’s style.

In our research we’ve found that switching to DAM can boost user engagement and immersion by up to 40%, due to the increased emotional immersion that DAM brings to the experience.

So there you have it, some textbook suggestions for how to improve immersion in VR. Next time you hear the phrase “immersion multiplier”, try to ask yourself which one of the four categories it addresses. Do you have your own ways to increase immersion in your experiences? Let us know in the comments below!