Before I begin processing my vocal, I want to get a rough level for everything. For this mix, we’ll start with something that doesn’t have any processing. I’m going to do a simple mix on the instrumentation, including some EQ, mild saturation, and maybe a little compression where needed, but I’ll leave things alone for the most part.
What I want to focus on in this part of the video though, is establishing a general level between all of my instruments and the vocal. It’s helpful to know where you want everything to sit before you begin to add your processing.
This way we can create proper gain staging between our multiple processors on the vocal. I’m going to have the vocal sit a little on top of the instrumentation, but depending on what you’re trying to achieve, you may want it a little lower than the rest of the mix.
One more thing I want to mention is the routing. Granted, I know this isn’t just about the vocal, but notice how I’ve grouped everything. I’ve grouped and color-coded the various instrument groups, then routed each group to a bus by changing its output from the stereo output.
This way each instrument group, like the drums, guitars, and so on, has an auxiliary track that controls the overall level of that group. So, if I need to quickly mute the drums, I can do that, and so on.
This helps a lot when trying to establish levels at the start of the mix, then it lets you quickly adjust things throughout the mixing process.
Let’s take a listen to the track we’re working on. I’ll solo the vocal for a moment, then introduce the mix, and notice how listening to the vocal in the context of the mix changes how it sounds.
Once we have some rough levels for the vocals, and we know if we want it to be a little above the mix, or buried a little underneath, we should edit the vocals with clip gain.
Clip gain affects the signal prior to any processing - so if we make the vocal more dynamically balanced with clip gain, our processors can react to it in a more uniform way.
This is admittedly the most time-consuming part of mixing a vocal, but it definitely improves the sound.
I like to listen to the full vocal and take mental note of parts that are difficult to hear. Then, I’ll make my way through the vocal and isolate the various parts that I want to amplify or attenuate.
I’ll want to attenuate any plosives or pop sounds, as well as attenuate any breaths that are too loud. Granted, I don’t want to remove these completely, since removing breath from the vocal will make it sound unnatural.
And removing sibilance entirely will make the singer sound like they have a lisp, so I’ll have to find a good medium.
Meanwhile, I’m going to find passages that seem too low or too high in amplitude - fortunately, we have a visual for this, but it also helps to use your ears. Then we’ll isolate the section, amplify the signal if it’s too quiet, or attenuate it if it’s too loud.
Let’s take a listen to the vocal balanced with clip gain, and notice how sibilance becomes less noticeable, how plosives are controlled, and how the overall level sounds more balanced.
For the vocal’s first processor, let’s use an EQ to remove some unmusical frequencies - by unmusical I mean anything that’s not related to the performance. For example, plosives, rumble, or hum can be attenuated by using a high pass filter.
I’ll use this Pro-Q 3, but a good free alternative is MEqualizer by MeldaProduction.
A 12dB/octave filter will avoid the aggressive phase changes of steeper slopes, which could otherwise amplify some unwanted frequencies - with it, we’ll set the cutoff right before the fundamental, or, the vocal’s lowest note.
If we want to attenuate a little of the lows, we could also use a 6dB/octave filter and set the cutoff frequency a little lower.
The slope will gradually attenuate low frequencies, especially around the vocal’s fundamental and more muddy-sounding frequencies.
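To make the slope difference concrete, here’s a small sketch in Python with NumPy and SciPy - not part of any mixing workflow shown here, and the 80Hz cutoff is just an example value - comparing how much a 12dB/octave and a 6dB/octave high-pass attenuate one octave below the cutoff:

```python
import numpy as np
from scipy import signal

sr = 48_000  # sample rate in Hz

# A 2nd-order Butterworth high-pass rolls off at 12 dB/octave;
# a 1st-order filter rolls off at 6 dB/octave.
hp_12 = signal.butter(2, 80, btype="highpass", fs=sr, output="sos")
hp_6 = signal.butter(1, 80, btype="highpass", fs=sr, output="sos")

def gain_db(sos, freq):
    """Magnitude response of the filter at one frequency, in dB."""
    _, h = signal.sosfreqz(sos, worN=[freq], fs=sr)
    return 20 * np.log10(abs(h[0]))

# One octave below the 80 Hz cutoff (i.e., at 40 Hz), the steeper
# filter attenuates roughly twice as many dB as the gentler one.
print(round(gain_db(hp_12, 40), 1))  # about -12 dB
print(round(gain_db(hp_6, 40), 1))   # about -7 dB
```

The gentler slope also means more attenuation reaches up toward the cutoff itself, which is why the 6dB/octave option works with its cutoff set a little lower.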
While we’re on the topic, let’s consider some other aspects of the vocal that can be attenuated at this point. 250Hz is often a good starting frequency to attenuate by a couple of dB. The reason is, there’s almost always a lot of energy in this range that masks higher, more clarifying frequencies.
Attenuating it by a couple of dB will increase the vocal’s clarity, and also means subsequent processors like compression, saturation, and so on won’t have to work as hard due to the lower overall amplitude.
If you notice any other frequencies in the vocal that need attenuation, now would be a good time to handle it. If the vocal is sounding nasally, attenuate between 700Hz and 1200Hz, wherever you hear a build-up of frequencies.
Keep in mind you’ll need to do this in the context of listening to the full mix since the frequencies in the mix will affect the sound of the vocal.
We could attenuate some of the sibilance at this point as well if it’s excessive. I’ll save this for a de-esser later on in the chain, so for now, let’s leave this range alone.
I can’t tell you exactly what to attenuate, since each vocal and mix are different, but this should be a good starting point.
Let’s listen to the vocal being attenuated with EQ, and notice how it sounds a little more balanced.
As I was saying in the last chapter, EQ isn’t always the best way to attenuate the vocal before additional processing - it’s great in a static way, but if we need that attenuation to be dynamic, then a multi-band processor is more helpful.
In this particular example, I’ll use frequency-specific compression, usually called a de-esser, to attenuate the vocal’s excessive sibilance. I’ll use this FabFilter de-esser, but T-De-Esser is a good free option, or you could just use your DAW’s stock plugin.
If I were to attenuate this range with an EQ, the attenuation would occur constantly, which may change the timbre of the vocal in a way that I don’t want. But with a de-esser, I can attenuate the vocal’s sibilance whenever it becomes loud enough, and leave the range alone whenever it isn’t causing an issue.
Finding the exact range to attenuate can be a little challenging - using your ears is always the best method, but a frequency analyzer is also helpful. Since we used an EQ as the previous processor, we could just observe the range and pay attention whenever spikes occur. Then, set the range for our de-esser to this range.
I’ll attenuate as much as needed but try to avoid more than 6dB of attenuation. If de-essing is used too aggressively, the vocal could lose needed detail, which might make it harder to make out the words.
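As a rough illustration of what a de-esser does under the hood, here’s a hypothetical broadband de-esser sketched in Python with NumPy/SciPy - the 5-9kHz band, threshold, and ratio are made-up starting values, not settings from any plugin mentioned here:

```python
import numpy as np
from scipy import signal

def deess(x, sr=48_000, lo=5_000, hi=9_000, thresh=0.05, ratio=4.0, max_cut_db=6.0):
    """Attenuate the whole signal whenever the sibilance band gets loud."""
    # Isolate the sibilance range with a band-pass filter
    band_sos = signal.butter(2, [lo, hi], btype="bandpass", fs=sr, output="sos")
    band = signal.sosfilt(band_sos, x)
    # Crude envelope follower: rectify, then smooth with a low-pass
    smooth = signal.butter(1, 50, btype="lowpass", fs=sr, output="sos")
    env = signal.sosfilt(smooth, np.abs(band))
    # Compress only while the band is over the threshold,
    # and cap the cut at ~6 dB, as suggested above
    gain_db = np.zeros(len(x))
    hot = env > thresh
    gain_db[hot] = -(1 - 1 / ratio) * 20 * np.log10(env[hot] / thresh)
    np.clip(gain_db, -max_cut_db, 0.0, out=gain_db)
    return x * 10 ** (gain_db / 20)
```

A real de-esser could attenuate only the detected band rather than the full signal, but the detect-then-duck logic is the same.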
Let’s listen to the vocal being de-essed, and notice how it balances the vocal’s high range, without being too overbearing.
Now that the vocal is balanced and doesn’t have aggressive unwanted frequencies or any unmusical sounding aspects, let’s introduce some compression.
I’ll use this Pro-C 2 but try MCompressor if you need one, or use your DAW’s stock plugin - whichever plugin you like most will work well.
Compression is typically thought of as a form of attenuation, but the purpose of compression in most cases is to lower the peaks so that we can amplify after the compression. This way we bring quieter details up while still having enough headroom.
If we still had a lot of sibilance or muddy frequencies, then these would be amplified along with the aspects that we want - since we attenuated them first with EQ and de-essing, we amplify more of the desirable aspects of the vocal.
Additionally, since we used clip gain to balance the vocal first, we won’t need aggressive attenuation to control our peaks.
So, let’s introduce compression - typically peak-down compression with a 10ms attack and 50ms release to cause quick attenuation that doesn’t add unwanted distortion.
If the compressor you’re using has lookahead, use 2ms of it - this way more of the vocal is captured, causing a more consistent sound. I like to use auto-make-up gain, but if yours doesn’t include this function, use the output or make-up gain function to amplify post-compression.
A 4:1 ratio with a slightly softer knee is a great starting point - adjust as needed, and carefully set your threshold. Depending on what you’re trying to achieve, the genre, and so on, you’ll want between 2 and 6dB of attenuation. For a more modern pop sound, aim closer to 6dB; for a more natural sound that retains some dynamics, aim closer to 2dB.
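For intuition on what the ratio and knee settings actually do, here’s a sketch of a compressor’s static gain computer in Python - the -18dB threshold and 6dB knee width are hypothetical numbers, not settings from the plugin:

```python
def gain_reduction_db(level_db, threshold_db=-18.0, ratio=4.0, knee_db=6.0):
    """How many dB a soft-knee compressor turns the signal down."""
    over = level_db - threshold_db
    if over <= -knee_db / 2:
        return 0.0                      # well below threshold: untouched
    if over >= knee_db / 2:
        return -over * (1 - 1 / ratio)  # above the knee: full 4:1 ratio
    # inside the knee: quadratic blend between the two regimes
    return -(1 - 1 / ratio) * (over + knee_db / 2) ** 2 / (2 * knee_db)

# A peak 12 dB over the threshold comes down by 9 dB at 4:1,
# leaving 3 dB of the original overshoot
print(gain_reduction_db(-6.0))   # -9.0
print(gain_reduction_db(-30.0))  # 0.0
```

The soft knee is what makes the transition gradual: signals hovering near the threshold get a gentler effective ratio than signals far above it.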
Let’s take a listen to the vocal before and after compression, and notice how it sounds louder, more consistent, and is easier to discern from the mix. Then I’ll adjust the output to bring the level back to where I had it originally.
Saturation is a combination of soft-knee compression and harmonic generation - the soft-knee compression controls dynamics from the top down, and the harmonics amplify quieter parts of the signal and increase the ratio of musical or in-key frequencies to out-of-key frequencies.
I’ll use this Saturn 2 plugin, but a great free alternative is GSat+ by TB Pro Audio.
With most saturation plugins you won’t have much control over the top-down compression - this will occur behind the scenes, with different algorithms resulting in various thresholds, knees, ratios, and so on. That said, you’ll need to listen carefully to what sounds best.
The harmonics are a little easier to hear and control - for example, even-order harmonics will result in a warmer sound, due to the 2nd-order harmonic amplifying the lows. Odd-order harmonics will emphasize the mid frequencies a bit more, since the 3rd and 5th-order harmonics are higher in frequency.
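If you want to see the even/odd distinction in numbers, here’s a small sketch waveshaping a pure tone in Python with NumPy - a generic illustration, not how any particular saturation plugin computes its harmonics:

```python
import numpy as np

sr = 48_000
t = np.arange(sr) / sr
x = np.sin(2 * np.pi * 100 * t)   # a pure 100 Hz tone

even = x + 0.2 * x**2             # squaring adds a 2nd harmonic (200 Hz)
odd = x - 0.2 * x**3              # cubing adds a 3rd harmonic (300 Hz)

def level(sig, freq):
    """Amplitude at a given frequency (1 Hz FFT bins, so index == Hz)."""
    return np.abs(np.fft.rfft(sig))[freq] / len(sig)

# The even shaper puts energy at 200 Hz only; the odd shaper at 300 Hz only
print(level(even, 200) > 1e-3, level(even, 300) < 1e-9)  # True True
print(level(odd, 300) > 1e-3, level(odd, 200) < 1e-9)    # True True
```

With a real vocal every note generates its own harmonic series, which is why the even/odd balance reads as an overall warm-versus-forward character rather than individual tones.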
With the GSat+ you can introduce a combination of odd and even to achieve the sound you want, as well as pick a warm algorithm to emphasize lows, or a crisp algorithm to bring the mids and high mids forward.
What you pick depends heavily on the vocal and the sound you want - so, flip through various settings until it sounds the way you want. Be sure to do this in the context of the full mix to understand how the added harmonics and mild compression interact with the complementary and competing signals of your mix.
Let’s take a listen to saturation, as I vary the settings from warm to cleaner sounds.
At this point you’re probably wondering why we’re still trying to control dynamics - between the last 3 processors, we’ve done a good amount of it. That said, each processor we’ve introduced controls dynamics in a slightly different way.
The next one we’ll use brings up quieter details without affecting the peaks - I like to use this after saturation to amplify the quieter harmonics that I just added. This makes the saturation more noticeable without having to drive the saturator too hard and introduce a noticeable breakup.
I’ll use this MV2, but OTT is a good free alternative if you use it very subtly - to use it for upward processing you’ll need to lower the depth amount, turn top-down attenuation off, and drag the middle sliders to the right.
Using this MV2 plugin and others like it is simple - it usually includes one function to adjust. Just drag or adjust it until you have the sound you want. Like peak-down compression, too much isn’t a good thing so use your ears and listen critically to it in the context of the mix.
Let's take a listen to the plugin being used, and notice how quieter details are brought forward. Also, notice how the effect of the previous saturation becomes more apparent.
As I said, introducing this type of processing isn’t strictly necessary, but I do like how quickly it balances the vocal in a way that I couldn’t achieve with other processors. In short, I’ll use this resonance reducer to dynamically attenuate various frequencies that are causing the vocal to sound unbalanced.
This Soothe 2 plugin is one of the few plugins that do this, and I’ve found it’s the best option, but Smooth Operator by Baby Audio is a more affordable option.
By adjusting the pre-emphasis EQ bands, I can control how sensitive each range is to attenuation - so if the vocal is still sounding a little muddy, I would emphasize the low mids. Each vocal is different so how you adjust the bands will depend, but usually a band in the low mids, one near nasal tones, and one to help with the vocal’s sibilance is a good starting point.
Additionally, I like to add this processor after some of my additive processors like the compressor, saturator, and so on, so that it balances what they’ve amplified.
Let’s take a listen to the plugin and notice how it adds some needed control to the frequency spectrum.
With most modern vocals, the sound is pretty dry. The vocalist typically sings close into a directional microphone, usually in a smaller studio with a lot of padding, resulting in a very dry sound.
This isn’t a bad thing, but it helps to add some subtle room reflections to create a sense of the vocalist being in a realistic space. For that reason, I’ll use a studio room emulation reverb to simulate how it would sound if the vocalist recorded in a studio with some wanted reflections.
I like to put this later in the chain after I’ve balanced things with some of the previous processors and created a dense sound. This way the reflections add to the musicality of the vocal, and don’t amplify unwanted aspects of the vocal.
I’ll use this Seventh Heaven plugin, but a stock reverb should work well - whichever one you like that allows for short reflections or a room emulation.
Then mix in the effect to a low level - something that’s just barely audible but there nonetheless.
Let’s take a listen to the reverb being introduced - it’ll be subtle for sure, but notice how it adds a realistic element to the performance.
At this point, the vocal is balanced, full, and has a natural sound due to the subtle room reflections we just added. Let’s start adding some creative processing to make it more interesting.
I’ll create a send from the vocal, and on the corresponding auxiliary track introduce a delay plugin. I’ll just use this stock delay plugin and create a 1/8th note delay on the left channel and a dotted 1/8th note delay on the right channel, then set the delay’s mix to 100% wet, since the effect sits on its own aux track.
Since we’re using an auxiliary track, I can lower the amount of the effect by adjusting the auxiliary channel’s fader. This will make automation easier, but we’ll cover that in a moment.
Next, I’ll insert a compressor after the delay on this track - any compressor will do, but make sure it’s capable of external side-chaining. Then, I’ll side-chain the original vocal track. I’ll use similar settings to the compressor I used on the original vocal to keep things simple, but I’ll turn off auto-make-up gain.
What we’ve essentially done is compress the delayed signal whenever the dry vocal is loud enough. As a result, the delay is attenuated whenever the vocal begins, letting the dry vocal come through before the delay kicks in.
This keeps the effect from being too aggressive and lets the listener hear the detail of the vocal while perceiving the delay more subconsciously.
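The side-chain ducking idea can be sketched in a few lines - this is hypothetical Python/NumPy with made-up threshold and time constants, just to show the wet signal being turned down while the dry vocal is hot:

```python
import numpy as np

def duck(wet, dry, sr=48_000, thresh=0.2, ratio=4.0, attack=0.01, release=0.1):
    """Attenuate `wet` (the delay return) while `dry` (the vocal) is loud."""
    a_att = np.exp(-1.0 / (attack * sr))    # fast coefficient: level rising
    a_rel = np.exp(-1.0 / (release * sr))   # slow coefficient: level falling
    env = 0.0
    gain = np.ones(len(wet))
    for i, s in enumerate(np.abs(dry)):
        coef = a_att if s > env else a_rel
        env = coef * env + (1 - coef) * s   # envelope of the dry vocal
        if env > thresh:
            # same gain law as a compressor, in the linear domain
            gain[i] = (env / thresh) ** (1 / ratio - 1)
    return wet * gain
```

When the vocal stops, the release lets the envelope fall and the delay swells back up, which is exactly the behavior described above.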
Let’s take a listen to this effect being blended in and notice how the delay isn’t too noticeable, yet the effect is still adding to the vocal.
We’ll do the same thing we did last chapter - that is, create a send and an aux track - but this time, instead of a delay, we’ll use a reverb plugin. This reverb setting will typically be longer than the realistic room emulation we introduced earlier.
Whatever setting you choose is completely up to you and depends on what you’re trying to achieve - I’ll use this Oxford reverb plugin and create some dense early reflections with a longer decay.
Then, like before, we’ll introduce a compressor that’s triggered by the dry vocal via the external side-chain. This way the beginning of the reverb is attenuated whenever the vocal is sung. We could use the same settings as before, or adjust them slightly to have the reverb come in earlier or later - again, depending on the sound you’re trying to achieve.
Lastly, we’ll blend the effect in with the aux track’s channel fader.
Let's take a listen to this being blended in and notice how the reverb is present, but isn’t so noticeable that it’s distracting or unnatural sounding.
Last up in terms of processing, I want to introduce an EQ that can control the totality of the vocal’s sound - one that adjusts the frequency response of both the original vocal and the sends we created.
I’ll change the output of the original vocal track to a bus - then, I’ll send the auxiliary tracks to this same bus. This way, all processing we’ve added to the vocal is being sent to the same track, on which I can insert an EQ that shapes everything.
I’ll have to listen closely to the vocal in the context of the mix and adjust accordingly. I’ll add some clarity to the vocal around 2.5kHz, as well as dip some lows that were added by the processing.
Then, I’ll add a little air to the highs. What’s great about having an EQ at the end of the chain is that I can adjust the mid and side channels separately - the side image was created by our temporal processing sends, so we can boost some of the highs on the side image to brighten those effects, or attenuate the reverb and delay around the vocal’s clarity range.
Again, what you do is up to you, but this is a good starting point.
Let’s take a listen to how this EQ shapes the vocal, and notice how it offers a lot of control over the final sound.
At this point, you’ve probably noticed that I haven’t done the best job with gain-staging - some of the processing has increased the vocal’s loudness and amplitude. I could go back and alter the gain-staging between my plugins, but I’ve found the easiest and least time-consuming method to be adjusting the final bus’s channel fader.
Since the plugins use 64-bit floating-point processing, I don’t have to worry about hitting a ceiling between the plugins, so adjusting the fader does essentially the same thing as gain-staging. I don’t think we need to listen to this step, so let’s move on to the last chapter of the video.
What’s great about having our vocal’s temporal effects on aux tracks is that it becomes easier to adjust their level with automation. So say that there’s a particular passage that could benefit from more reverb, we could simply automate the reverb’s aux track channel fader.
The same could be said about the delay. Additionally, say I want to cut out the fundamental frequency for the full vocal or maybe do a frequency sweep to everything at once - I could automate a high or low-pass filter on the bus’s EQ at any point of the song.
Like most things when mixing, there are no exact answers, so create automation to your effects or maybe the vocal’s overall level where you see fit. I find that sections like the chorus or a part that includes more instrumentation are good points to adjust the level of these effects.
Let’s take a listen to the finalized mixed vocal with automation included, and notice how it sounds balanced, full, and has an overall polished and enjoyable sound.