Film mixing is the process of combining the different sonic elements for film into a final SoundBed.
Loudness of Content.
There are many loudness standards nowadays and many types of media and platforms so making sure audio is on the correct level everywhere can be tricky. In this post, I’m going to talk about the history of measuring loudness and the current standards that we use nowadays.
In the analogue days, reading audio levels always implied measuring voltage or power in a signal and comparing it to a reference value. When trying to determine how loud an audio signal is, we can just measure these values across time but the problem is that levels are usually changing constantly. So how do we best represent the overall level?
A possible approach would be to just measure the highest value. This method of measuring loudness is called Peak and is handy when we want to make sure we are not working with levels above the system capacity to make sure our signals are not saturated. But in terms of measuring the general level of a piece of audio, this approach can be very deceiving. For example, a very quiet signal with a sudden loud transient would register as loud despite being quiet as a whole.
As you are probably thinking, a much better method would be to measure an average value across a certain time window instead of the instant reading that peak meters provide. This is usually called RMS (root mean square) metering and it is much closer to how we humans perceive loudness.
Let’s have a look at some of the meters that were created:
VU (Volume Unit) meters are probably the most used meters in analogue equipment. They were designed in the 1940s to measure voltage with a response time similar to how we naturally hear. The method is surprisingly simple: the needle’s own weight slows down its movement by around 300 ms on both the attack and the release so very sudden changes would be soften. The time that the meter needs to start moving is usually called the integration time. You will also hear the term “ballistics” to define these response times.
The PPM (peak programme meter) is a different type of meter that was widely used in the UK and Scandinavia since the 1930s. Unlike the the VU meter, PPM uses very short attack integration times (around 10ms for type II and 4ms for type I) while using relatively long times for the release (around 1.5 seconds for a 20dB fall). Since these integration times are very short, they were often consider quasi-peak meters. The long release time helped engineers see peaks for a longer time and get a feel of the overall levels of a programme since levels would fall slowly after a loud section.
The Dorrough Loudness Meter is also worth mentioning. It combines a RMS and a peak meter in one unit and was very common in the 90s. We will see that combining a RMS and peak meter in a single unit was going to be a trend that will carry on until today.
As digital audio started to become the new industry standard, new ways to measure audio levels needed to be adopted. But how do we define how much 0 is in the digital realm? In analogue audio, the value we assign to 0 is usually some meaningful measure that help us avoid saturating the audio chain. These values used to be measured in volts or watts and would vary depending on the context and type of gear. For example, for studio equipment in the US, 0VU corresponds with +4 dBu (1.228 V) while europe’s 0VU is +6 dBu (1.55 V). Consumer equipment uses -10dBV (0.3162V) as their 0VU. As you can see, the meaning of 0VU is very context dependant.
In the case of digital audio, 0dB is simply defined as the loudest level that flows through the converters before clipping, this is, before the waveform is deformed and saturation is introduced. We call this definition of the decibel, dBFS (Decibel Full Scale). How digital audio levels correspond with analogue levels depends on how your converters are calibrated but usually 0VU is equated to around -20dBFS on studio equipment.View fullsize
TV Regional and National Standards
Now that we have a good idea of all the different characteristics standards use, let’s see how they compare.
|Country / Region||Standard||Units Used||Integrated Level||True Peak||Integrated level method|
|Europe||EBU R128||LUFS||-23 LUFS||-1 dBTP||Relative Gate|
|US||ATSC A/85 post 2013||LKFS||-24 LKFS||-2 dBTP||Relative Gate|
|US||ATSC A/85 pre 2013 (Commercials)||LKFS||-24 LKFS||-2 dBTP||All elements are considered|
|US||ATSC A/85 pre 2013 (Programmes)||LKFS||-24 LKFS||-2 dBTP||Dialogue Detection|
|Japan||TR-B32||LUFS||-24 LUFS||-2 dBTP||Relative Gate|
|Australia||OP-59||LKFS||-24 LKFS||-2 dBTP||Relative Gate|
Loudness for Digital Platforms
I have tried to find the specifications for some of the most used digital platforms but I was only able to find the latest Netflix specs. Hulu, Amazon and HBO don’t specify their requirements or at least not publicly. If you need to deliver a mix to these platform, make sure they send you their desired specs. In any case, using the latest EBU or ATSC recommendations is probably a good starting point.
In the case of Netflix, their specs are very curious. They ask for a integrated level of –27 LKFS and a maximum true peak of -2 dBTP. The method to measure the integrated level would be dialogue detection, like the ATSC used to recommend, which in a way is a step back. Why would Netflix recommend this if the ATSC spec moved on to gated based measurements? Netflix basically says that when using the gated method, mixes with a large dynamic range tend to leave dialogue too low so they propose a return to the dialogue detection algorithm.
The thing is, this algorithm is old and can be inaccurate so this decision was controversial. A new, modern and more robust algorithm could be a possible solution for this high dynamic range mixes. Also, -27 LKFS may sound too low but it wasn’t chosen arbitrarily but based on the fact that that was the level where dialogue would usually end up on these mixes. If you want to know more about this, you can check this, this and this article.
Loudness for Theatrical Releases
The case of cinema is very different from broadcast for a very simple reason: you can expect a certain homogeneity in the reproduction systems that you won’t find in home setups. For this reason there is no hard loudness standard that you have to follow.
|Dolby Scale||SPL (dBC)|
This lack of general standard has resulted in a similar loudness war to the one in the music mixing world. The result are lower dynamic ranges and many complains about cinemas being too loud. Shouldn’t cinema mixes offer a bigger dynamic range experience than TV? How are these levels determined?
Cinema screens have a Dolby box where the projectionist would set the general level. These levels are determined by the Dolby Scale and correspond to SPL measures under a C curve when using the “Dolby noise”. Remember that, in the broadcast world, the K curve is used instead which doesn’t help things when trying to translate between both.
Nowadays more and more cinemas are automated. This means that levels are set via software or even remotely. At first, all cinemas were using level 7, which is the one recommended by Dolby but as movies were getting louder and people complained, projectionists would start to use lower levels. 6, 5 and even 4.5 are used regularly. In turn, mixers started to work in those levels too which resulted in louder mixes overall in order to get the same feel. This, again, made cinemas lower their levels even more.
You see where this is going. To give you an idea, Eelco Grimm together with Michel Schöpping analyzed 24 movies available at dutch cinemas and found out levels that would vary wildly. The integrated level went from -38 LUFS to -20 LUFS, with the maximum Short-term level varying from -29 LUFS to -8 LUFS and the maximum True-Peak level varying from -7 to +3.5 dBTP. Dialogue levels varied from -41 to -25 LUFS. That’s quite a big difference, imagine if that would be the case in broadcast.
The thing is that despite these numbers being very different, we have to remember that all these movies probably were played at different levels on the dolby scale. Eelco says on his analysis:
- The average playback level for movies mastered at ‘7’ is -28 LUFS (-29 to -25).
- The average playback level for movies mastered at ‘6.3’ is -23 LUFS (-25 to -21). They are projected 3 dB softer, so if we corrected the average to a ‘7’ level, it would be -26 LUFS.
- The average playback level for movies mastered at ‘5’ is -20 LUFS (all were -20). They are projected 7 dB softer, so the corrected average would be -27 LUFS.
So, as you can see, at the end dialogue level is equivalent to about -27 LUFS in all cases, the only difference is that the movies that were mixed at 7 (which is the recommended level) would have greater dynamic range, something important to be able to give a cinematic feel that TV can’t provide. The situation is quite unstable and I hope a solid solution based in the ITU recommendations is implemented at some point. If you want to know more about all this issue and read the paper that Eelco Grimm released, check this comprehensive article.
Loudness standards for video games.
Video games are everywhere: consoles, computers, phones, tablets, etc, so there is no clear standard to use. Having said that, some companies have stablished some guidelines. Sony, through their ASWG-R001 document recommends the following:
- -23 LUFS and -1dBTP for Playstations 3 and 4 games.
- -18 LUFS and -1dBTP for PSVita games.
- The maximum loudness range recommended is 20 LU.
But how do you measure the integrated loudness in a game? Integrated loudness was designed for linear media so Sony’s document recommends to make measurements in 30 minutes sessions that are a good representation of different sections of the game.
So, despite games being so diverse in platforms and contexts using the EBU recommendations for consoles and PC (-23 LUFS) and a louder spec for mobile and portable games (-18 LUFS) would be a great starting point.
I hope you now have a solid foundation of knowledge for the subject. Things will keep changing so if your read this in the future, assume some of this information is outdated. Nevertheless, you would have hopefully learned the concepts you need to work with loudness now and in the future. You can keep in touch with us with the contacts below:
Whatsapp: +256 392 002051