5 Tips for Better Vocals

Vocals are arguably the most difficult and important aspect of a mix to get right - so I’m going to share what I’ve learned over the past 10 years and distill it down into 5 Tips for Better Vocals.

Understanding Standard Routing

I’m not sure if this concept has a name, so I just call it standard routing. It’s a very logical progression of inserts: subtractive and corrective EQ, followed by de-essing, compression, tuning if needed, saturation, temporal processing, and then one last EQ.

The first EQ balances out the sound by attenuating any frequencies and ranges that are too aggressive. De-essing then balances the sibilance before the signal feeds into our first additive processor - compression.

The compressor receives a cleaner signal, meaning that when we introduce post-compression auto-make-up gain, we amplify a balanced signal.

Then, the tuner receives an upfront signal with all of the quieter details emphasized, helping it determine the pitch more accurately.

Saturation can be used to give the vocal a unique timbre and further control dynamics. Temporal processing like delay and/or reverb gives the vocal a sense of depth while introducing a stylistic element, and the final EQ lets you have the last say over the frequency response.

This is probably the chain I use 70% of the time, since its consistent structure makes problems easy to identify and fix. I’ll use free and stock plugins so that you can emulate the chain if you’d like, but be sure to tailor the settings to your specific performance.

Let’s listen to a before and after.

Watch the video to learn more >

Use Multiple Tuning Methods

Most people stick to one method when tuning a vocal, but like compressors, saturators, or any other insert, it helps to try a couple of options and see what works best.

For example, when tuning, try both a traditional auto-tune-style tuner and a note-by-note melody correction style and see which one creates the sound you’re going for.

Personally, I like to tune with EQ, saturation, and chorus effects. With an EQ, I can emphasize in-key notes by amplifying their frequencies. Then, with a saturator, I can create in-key even-order harmonics using most warm tube settings. Then, with a chorus effect or a harmony generator, I can apply mild to moderate delay and detuning to multiple vocal taps, creating a generalized pitch that’s closer to in tune than the original vocal’s singular signal.
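If you’re curious what the chorus half of that idea looks like under the hood, here’s a toy sketch in Python with NumPy - the function name and parameter values are my own illustrative choices, not settings from any particular plugin:

```python
import numpy as np

def simple_chorus(x: np.ndarray, sr: int, taps: int = 3,
                  base_delay_ms: float = 15.0, depth_ms: float = 3.0,
                  rate_hz: float = 0.8, mix: float = 0.5) -> np.ndarray:
    """Toy chorus: sum several copies of the vocal, each behind a
    slowly modulated delay line. The moving delay is what produces
    the mild detuning described above."""
    n = len(x)
    t = np.arange(n) / sr
    wet = np.zeros(n)
    for k in range(taps):
        # Give each tap its own LFO phase so the detuning differs per tap.
        phase = 2.0 * np.pi * k / taps
        delay_ms = base_delay_ms + depth_ms * np.sin(2.0 * np.pi * rate_hz * t + phase)
        read_pos = np.arange(n) - delay_ms * sr / 1000.0
        lo = np.clip(np.floor(read_pos).astype(int), 0, n - 1)
        hi = np.clip(lo + 1, 0, n - 1)
        frac = read_pos - np.floor(read_pos)
        # Linear interpolation for the fractional (detuned) delay reads.
        wet += (1.0 - frac) * x[lo] + frac * x[hi]
    wet /= taps
    return (1.0 - mix) * x + mix * wet
```

In practice, you’d just reach for a chorus plugin and adjust the rate, depth, and number of voices by ear - the sketch is only meant to show why modulated delay reads as detuning.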

It may not sound as in tune as a tuner does, but it definitely sounds more natural.

Let’s listen to the 3 different methods - I’ll have to use paid plugins for this demo, but there are some free tuners out there like MAutoPitch.

Watch the video to learn more >

Utilize In-Time Processing

Almost every type of processor can be made in time with the session’s BPM, provided the track uses a set tempo.

Although the math is the same for each processor, there are some small changes to make between timing reverb, compression, delay, and modulated saturation.

The equation is just 60,000 / BPM, which gives the length of a quarter note in milliseconds. So, if the session is 120 BPM, 60,000 / 120 = 500, meaning a quarter note lasts 500ms or 0.5 seconds.
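If it helps to see that math laid out, here’s a quick sketch in Python - note_length_ms is just a hypothetical helper name, and the beats argument scales the result for other note values:

```python
def note_length_ms(bpm: float, beats: float = 1.0) -> float:
    """Length of a note in milliseconds: 60,000 / BPM gives one beat
    (a quarter note); scale by `beats` for other note values."""
    return 60_000 / bpm * beats

print(note_length_ms(120))       # quarter note -> 500.0ms
print(note_length_ms(120, 0.5))  # eighth note  -> 250.0ms
print(note_length_ms(120, 4.0))  # whole note   -> 2000.0ms
```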

Timing delay is the most straightforward - it’ll sound in time, and most delay plugins include tempo-synced settings to keep it easy.

For reverb, the math offers a good starting point, but then you’ll need to adjust. For example, say I want a whole-note decay on my reverb. Again, if the session is 120 BPM, then 500ms is a quarter note, and 2000ms or two seconds would be a whole note.

I’d start with 2 seconds, but then lengthen it within the context of the mix. Since the rest of the instrumentation will mask the reflections, especially near the tail end of the reverb, the decay will need to be increased to still be perceived as a whole note.

For compression, again, the same equation applies, but the attack time needs to be accounted for and subtracted from the note length to find the release time.

For example, if I want 1/8th-note attenuation, which is 250ms at 120 BPM, and I want an attack of 10ms, the release would be set to 240ms to keep it in time. Additionally, the threshold, knee, ratio, and other settings would need to be accounted for to ensure attenuation occurs in time with the song.
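As a sketch, the release math from that example might look like this (same hypothetical helper style as above):

```python
def compressor_release_ms(bpm: float, beats: float, attack_ms: float) -> float:
    """Release time so that attack + release together span the note length."""
    note_ms = 60_000 / bpm * beats
    return note_ms - attack_ms

# 120 BPM, 1/8th-note (0.5 beat) attenuation, 10ms attack -> 240.0ms release
print(compressor_release_ms(120, 0.5, 10))
```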

Lastly, for modulated saturation, which isn’t too common, you’d use an envelope follower and, like compression, time the attack and release so that together they equal the length of the note you want.
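For the curious, an envelope follower at its simplest is just a one-pole smoother with separate attack and release time constants. Here’s a toy Python sketch - not any particular plugin’s implementation - where you’d feed in times derived from the BPM math above:

```python
import numpy as np

def envelope_follower(signal: np.ndarray, sr: int,
                      attack_ms: float, release_ms: float) -> np.ndarray:
    """One-pole envelope follower with separate attack/release times."""
    # Per-sample smoothing coefficients derived from the time constants.
    atk = np.exp(-1.0 / (sr * attack_ms / 1000.0))
    rel = np.exp(-1.0 / (sr * release_ms / 1000.0))
    env = np.zeros(len(signal))
    level = 0.0
    for i, x in enumerate(np.abs(signal)):
        # Rising input uses the attack coefficient, falling uses release.
        coeff = atk if x > level else rel
        level = coeff * level + (1.0 - coeff) * x
        env[i] = level
    return env
```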

Let’s listen to the standard chain, but with each form of processing timed to the track’s BPM. It may be a preference thing, but I think the performance sounds more musical and fits in a bit better with the surrounding instrumentation.

Watch the video to learn more >

Understanding the Fletcher-Munson Curve

I’ve mentioned this before, but I’m going to combine a few important concepts here and create a new graphic to connect the ideas.

In short, we’re most sensitive to the range between 2kHz and 5kHz. The Fletcher-Munson curve, based on multiple listening tests over the decades, shows this, but the way it’s typically plotted is kind of confusing.

Inverting it makes more sense, with the x-axis representing frequency and the y-axis representing our sensitivity.

Then, we have to understand masking - although more research is needed, 300Hz was determined in the 1950s to be the most dominant masking frequency of higher ranges. In the graphic, the red translucent overlay represents the frequencies that are affected by 300Hz.

Any form of processing, be it EQ, compression, saturation, reverb, delay, tuning, and so on, will affect the frequency response. If your saturator increases 3.5kHz by introducing harmonics in that area, it will make the vocal easier to hear, since it’s boosting the range to which we’re most sensitive.

If your processing amplifies 300Hz, it’ll make the vocal harder to hear, since those frequencies will mask 3.5kHz.

Although all frequency ranges are important, I always pay particular attention to the low mids and the high mids, since the relationship between them has a huge impact on the clarity and intelligibility of the vocals.
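If you want to sanity-check that relationship numerically rather than by ear, here’s a rough sketch using SciPy - the band edges and the "vocal.wav" path are my own assumptions, and a spectrum analyzer plugin will show you the same thing visually:

```python
import numpy as np
from scipy.io import wavfile
from scipy.signal import welch

# "vocal.wav" is a placeholder path - use your own bounced vocal.
sr, audio = wavfile.read("vocal.wav")
if audio.ndim > 1:
    audio = audio[:, 0]          # take one channel if stereo
audio = audio.astype(np.float64)

freqs, psd = welch(audio, fs=sr, nperseg=8192)

def band_energy(f_lo: float, f_hi: float) -> float:
    mask = (freqs >= f_lo) & (freqs <= f_hi)
    return float(psd[mask].sum())    # bin width cancels in the ratio

low_mids = band_energy(200, 500)     # the masking range around 300Hz
high_mids = band_energy(2000, 5000)  # the range we're most sensitive to
print(f"low-mid / high-mid energy ratio: {low_mids / high_mids:.2f}")
```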

Let’s listen to processing that improves the balance between these ranges, and processing that makes the ranges unbalanced.

Watch the video to learn more >

Don’t Overuse Resonance Processors

Resonance processors like Soothe 2 are relatively new and are definitely being overused. One of the more questionable popular uses is placing Soothe 2 on the instrumental with the vocal as the sidechain input.

The idea is to carve out space for the vocal by attenuating resonances where the two tracks overlap. This is lazy at best and detrimental to the quality of your mix at worst. Don’t do it unless you’re making a demo and just want to alter some things quickly.

Additionally, if you have to use more than 6dB of overall attenuation on a vocal using these types of plugins, there’s something wrong with the vocal. Use EQ, compression, and other forms of processing to adjust the frequency response while using this plugin as a guide.

Also, know that these types of plugins may not be suited for your genre or vocal style - for example, if you purposefully have a heavily distorted vocal, resonance processors will drastically affect the timbre in a way that doesn’t augment the vocal.

These are very well-designed plugins, especially Soothe 2. Just be sure you don’t use them when they’re not needed, or on every track - otherwise your tracks will all start sounding too similar.

Let’s listen to Soothe 2 being used unnecessarily on a vocal and notice how it doesn’t augment the track in any way.

Watch the video to learn more >