According to a paper published in 2014 titled ‘Intelligent Audio Production Strategies Informed by Best Practices,’ the reverb level preferred by 30 evaluators was 9 LU lower than the dry signal.
They also found that listeners disliked higher reverb levels in the context of popular music, especially when the reverberated signal was as loud as the dry signal.
I’m curious which you all prefer, so let’s try this out on vocals with the reverb set at -15 LU, -12 LU, -9 LU, -6 LU, -3 LU, and at the same loudness as the vocal. Reverb level is a relative thing in my opinion, but let me know which one sounds best to you.
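If you’d like to set these offsets precisely on your own material, here’s a minimal sketch of the idea in Python. It assumes the pyloudnorm and soundfile packages and hypothetical file names for the dry vocal and a 100%-wet reverb return - it’s just one way to dial in the offset, not the paper’s method.

```python
# A minimal sketch: place a reverb return a chosen number of LU below the dry vocal.
# Assumes the pyloudnorm and soundfile packages and hypothetical file names.
import soundfile as sf
import pyloudnorm as pyln

dry, sr = sf.read("vocal_dry.wav")            # dry vocal
wet, _ = sf.read("vocal_reverb_return.wav")   # 100% wet return, same sample rate

meter = pyln.Meter(sr)                        # ITU-R BS.1770 loudness meter
dry_lufs = meter.integrated_loudness(dry)
wet_lufs = meter.integrated_loudness(wet)

target_offset_lu = -9                         # try -15, -12, -9, -6, -3, or 0
gain_db = (dry_lufs + target_offset_lu) - wet_lufs
wet_scaled = wet * 10 ** (gain_db / 20)       # return now sits target_offset_lu below the dry vocal

sf.write("vocal_reverb_at_offset.wav", wet_scaled, sr)
```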
Watch the video to learn more >
According to the same paper, after measuring over 900 commercial #1 singles, there’s a clear pattern in the dynamic range of each frequency band.
Overall, low frequencies are more dynamically controlled than high frequencies.
Across the 8 octaves spanning 20Hz-20kHz, the most control occurs in the sub and low range, with a peak-to-RMS ratio of about 16:1. This gradually increases to about 17:1 to 18:1 over the next few octaves, before a more significant jump in dynamic range across the highest 3 octaves.
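If you want to check your own mixes against these figures, here’s a rough sketch of measuring the peak-to-RMS ratio per octave band in Python. It assumes numpy, scipy, soundfile, a hypothetical "mix.wav," and a sample rate of at least 44.1kHz; the band splits and filter order are arbitrary choices, not the paper’s analysis method.

```python
# A rough sketch: measure the peak-to-RMS ratio (crest factor) per octave band.
import numpy as np
import soundfile as sf
from scipy.signal import butter, sosfilt

audio, sr = sf.read("mix.wav")
if audio.ndim > 1:
    audio = audio.mean(axis=1)         # fold to mono for a single measurement

def crest_factor(x, eps=1e-12):
    return np.max(np.abs(x)) / (np.sqrt(np.mean(x ** 2)) + eps)

low = 20.0
while low < 20000:
    high = min(low * 2, 20000)         # octave bands from 20Hz up to 20kHz
    sos = butter(4, [low, high], btype="bandpass", fs=sr, output="sos")
    band = sosfilt(sos, audio)
    print(f"{low:6.0f}-{high:6.0f}Hz  peak-to-RMS about {crest_factor(band):.1f}:1")
    low = high
```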
Let’s try this out on vocals using frequency-specific compression. I’ll aim for the same crest factor, or peak-to-RMS ratio, as what’s shown here, with the exception of the lowest frequency range, which I’ll attenuate with a high-pass filter instead.
Then, for comparison, I’ll apply ratios that don’t correspond to the dynamics the paper identifies as ideal. Let’s listen to these vocals soloed and in the context of an instrumental. For the instrumental, I’ll adhere to the recommended dynamic ranges.
Watch the video to learn more >
One interesting finding in this paper concerns the loudness of the lead vocal relative to everything else: the authors suggest a vocal should be equal in loudness to the rest of the mix.
So if the instrumental is -23 LUFS, the vocal should be -23 LUFS, and so on.
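Here’s a minimal sketch of that static match in Python, assuming pyloudnorm and soundfile and hypothetical file names. It simply measures both integrated loudness values and applies the difference as gain to the vocal.

```python
# A minimal sketch: set a vocal to the same integrated loudness as the instrumental.
import soundfile as sf
import pyloudnorm as pyln

vocal, sr = sf.read("vocal.wav")
inst, _ = sf.read("instrumental.wav")    # assumed to share the vocal's sample rate

meter = pyln.Meter(sr)
vocal_lufs = meter.integrated_loudness(vocal)
inst_lufs = meter.integrated_loudness(inst)

# Gain, in dB, that places the vocal at the instrumental's loudness -
# e.g. if the instrumental measures -23 LUFS, the vocal will too
gain_db = inst_lufs - vocal_lufs
vocal_matched = vocal * 10 ** (gain_db / 20)

sf.write("vocal_matched.wav", vocal_matched, sr)
```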
If the instrumental dips in loudness during a particular passage, a vocal rider should be used to attenuate the vocal’s loudness, and vice versa.
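And here’s a very rough, hypothetical take on the riding idea, continuing from the previous sketch. It uses windowed RMS as a crude stand-in for short-term loudness and is only meant to illustrate the concept, not to replace a proper vocal rider.

```python
# A rough, hypothetical rider: nudge the vocal up or down as the instrumental's
# level changes from passage to passage. Continues from the previous sketch
# (vocal_matched, inst, sr).
import numpy as np
import soundfile as sf

def rms_db(x, eps=1e-12):
    return 20 * np.log10(np.sqrt(np.mean(x ** 2)) + eps)

win = sr                                   # 1-second windows; an arbitrary choice
n = min(len(vocal_matched), len(inst))
inst_overall_db = rms_db(inst[:n])

offsets_db = []
for start in range(0, n, win):
    passage = inst[start:start + win]
    # How far the instrumental sits above or below its overall level here;
    # the vocal gets the same offset so the balance between them stays put
    offsets_db.append(rms_db(passage) - inst_overall_db)

# Interpolate per-window offsets to per-sample gain to avoid audible steps
centers = np.arange(len(offsets_db)) * win + win / 2
gain_db_curve = np.interp(np.arange(n), centers, offsets_db)
gain_lin = 10 ** (gain_db_curve / 20)
if vocal_matched.ndim > 1:                 # expand for stereo files
    gain_lin = gain_lin[:, None]

sf.write("vocal_ridden.wav", vocal_matched[:n] * gain_lin, sr)
```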
So, let’s take a listen to a vocal set to the same loudness as the instrumental. I’ll vary the vocal’s level, moving it above and below this mark, and let me know if a particular setting sounds best to you.
Watch the video to learn more >
In a 2016 paper titled ‘Gestalt Theory and Mixing Audio,’ author Matthew Shelvock proposes that a person’s tendency to group related and semi-related information into simplified and meaningful groups has huge implications when mixing music.
For example, the Gestalt principle of ‘common fate’ states that stimuli are grouped together when they undergo similar changes. So, say I have 3 vocal tracks - 1 is the lead and 2 are background vocals (BGVs) that don’t follow the same melody as the lead.
If they start and stop at the same time, or just stop at the same time, a listener will group these 3 distinct signals as a cohesive unit.
This may give us some insight into why bus processing is so effective at creating cohesion - by affecting multiple signals with the same change, so to speak, we cause the listener to construct a meaningful connection between them.
Following that idea, let’s listen to vocals that have been processed individually, and then vocals that are processed collectively. I’ll adjust the processing so the overall amount of change is as close as possible between the two examples, but let me know which one sounds more cohesive.
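To make the routing concrete, here’s a small sketch of the two setups in Python. It assumes numpy, soundfile, hypothetical file names, and a simple tanh saturator standing in for whatever shared processing you’d actually use; for a fair listening comparison you’d still want to loudness-match the two results.

```python
# A minimal sketch contrasting individual and bus processing.
import numpy as np
import soundfile as sf

def soft_clip(x, drive=2.0):
    # One simple "change" we can apply either per track or to the whole bus
    return np.tanh(drive * x) / np.tanh(drive)

lead, sr = sf.read("lead_vox.wav")
bgv1, _ = sf.read("bgv_1.wav")
bgv2, _ = sf.read("bgv_2.wav")
n = min(len(lead), len(bgv1), len(bgv2))   # trim to a common length
lead, bgv1, bgv2 = lead[:n], bgv1[:n], bgv2[:n]

# Individual processing: each vocal gets its own instance of the change
individual = soft_clip(lead) + soft_clip(bgv1) + soft_clip(bgv2)

# Bus processing: the vocals are summed first and changed together, so one
# process responds to all three signals at once
bus = soft_clip(lead + bgv1 + bgv2)

for name, mix in [("vocals_individual.wav", individual), ("vocals_bus.wav", bus)]:
    sf.write(name, mix / np.max(np.abs(mix)), sr)  # peak-normalize before writing
```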
Watch the video to learn more >
In the same paper the author offers an illustration - at first glance it reads as 1 straight line and an interesting curved line; however, when we take a moment, it can also be perceived as a black line and a red line, both of which converge and then move in different directions.
As it relates to audio, this suggests that the pattern we perceive will be the one that involves the seemingly simplest or smoothest change.
This makes sense: if I have 2 clips from a vocal and I want to create the perception of one continuous performance, a crossfade is often a good choice. It simplifies the listener’s experience by reducing the perceivable change.
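For reference, here’s a small sketch of an equal-power crossfade in Python, assuming numpy and soundfile, hypothetical clip names, and an arbitrary 50 ms overlap.

```python
# A small sketch of an equal-power crossfade between two vocal clips.
import numpy as np
import soundfile as sf

clip_a, sr = sf.read("vocal_clip_a.wav")
clip_b, _ = sf.read("vocal_clip_b.wav")    # assumed to share the sample rate

fade_len = int(0.05 * sr)                  # 50 ms, an arbitrary starting point
t = np.linspace(0, np.pi / 2, fade_len)
fade_out, fade_in = np.cos(t), np.sin(t)   # cos^2 + sin^2 = 1, so power stays constant
if clip_a.ndim > 1:                        # expand for stereo clips
    fade_out, fade_in = fade_out[:, None], fade_in[:, None]

joined = np.concatenate([
    clip_a[:-fade_len],
    clip_a[-fade_len:] * fade_out + clip_b[:fade_len] * fade_in,  # the overlap
    clip_b[fade_len:],
])
sf.write("vocal_crossfaded.wav", joined, sr)
```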
Creatively speaking, this gives mixing engineers good reason to create drastic changes to combat this tendency.
For example, say I have a lead vocal and a counter-melody. Already the 2 vocals will sound distinct due to variations in the performance; however, if their timbres match closely, it may become difficult for listeners to identify both.
Now, let’s say I keep the original vocal as is, but heavily distort and equalize the counter-melody. By creating multiple layers of perceivable change, we take away the listener’s ability to group the 2 vocals together.
To show this, I’m going to record a single vocal line. Notice that if I separate the clip and apply unique processing to the separated part, I’ve created 2 distinct identities - it’s no longer one continuous performance but 2 performances delineated by their unique timbres.
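If you’d like to experiment with this outside of a DAW, here’s a rough sketch of the same idea in Python, assuming numpy, scipy, and soundfile and a hypothetical file name; the drive amount and EQ band are arbitrary choices meant only to push the second half’s timbre away from the first.

```python
# A rough sketch of the demo: split one vocal take and give the second half its
# own distortion and EQ.
import numpy as np
import soundfile as sf
from scipy.signal import butter, sosfilt

vocal, sr = sf.read("vocal_take.wav")
split = len(vocal) // 2                     # separate the clip roughly in the middle

first_half = vocal[:split]                  # left untouched
second_half = np.tanh(6.0 * vocal[split:])  # heavy-handed tanh distortion
sos = butter(2, [300, 3000], btype="bandpass", fs=sr, output="sos")
second_half = sosfilt(sos, second_half, axis=0)  # band-pass EQ to shift the timbre

sf.write("vocal_two_identities.wav", np.concatenate([first_half, second_half]), sr)
```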