When working with a harsh vocal, you can typically tame it by attenuating frequencies between 2kHz and 12kHz using a subtle bell filter on an EQ. Attenuating 3-5kHz will help significantly, as will using a de-esser on 5kHz to 12kHz to reduce harsh-sounding sibilance.
Which frequencies count as harsh is a little subjective, but most people find 2kHz to 5kHz to be the most abrasive. If we look at the Fletcher-Munson curves, we can see that we’re most sensitive to this region of frequencies - but to simplify this, here’s an inverse of this graph.
We’ll notice that the areas with the red line are where our ear is most sensitive, while blue represents where we’re least sensitive. Notice that while sub and air frequencies are hard to hear, 3-5kHz, give or take a few hundred Hertz, is what sounds loudest to most people.
That said, sibilance, which sits directly above this range, is also often described as sounding harsh.
With that in mind, let’s take a listen to our unprocessed vocal, and consider how these frequencies will come into play as we try to tame it later on.
The easiest way to reduce harsh frequencies is with an EQ - all we need to do is reduce these areas and the vocal sound becomes less abrasive. That said, 3-5kHz also helps the vocal cut through a mix, so if we reduce it too much, it’ll get buried.
So let’s listen to the vocal with 3-5kHz attenuated a few dB, and notice if it sounds less aggressive.
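For anyone curious what a bell cut like this looks like under the hood, here's a minimal sketch in Python. It uses the widely published Audio EQ Cookbook formulas for a peaking filter; the 48kHz sample rate, 4kHz center, Q of 1, and -3dB gain are assumptions picked for illustration, not settings taken from any particular EQ plugin.

```python
import cmath
import math

def peaking_biquad(fs, f0, gain_db, q):
    """Return normalized (b, a) coefficients for a peaking (bell) filter."""
    amp = 10 ** (gain_db / 40)
    w0 = 2 * math.pi * f0 / fs
    alpha = math.sin(w0) / (2 * q)
    b = [1 + alpha * amp, -2 * math.cos(w0), 1 - alpha * amp]
    a = [1 + alpha / amp, -2 * math.cos(w0), 1 - alpha / amp]
    return [x / a[0] for x in b], [x / a[0] for x in a]

def gain_db_at(b, a, fs, f):
    """Magnitude response of the filter, in dB, at frequency f."""
    z1 = cmath.exp(-1j * 2 * math.pi * f / fs)  # this is z^-1
    num = b[0] + b[1] * z1 + b[2] * z1 * z1
    den = a[0] + a[1] * z1 + a[2] * z1 * z1
    return 20 * math.log10(abs(num / den))

# Assumed settings: a gentle 3 dB cut centered between 3kHz and 5kHz.
b, a = peaking_biquad(fs=48000, f0=4000, gain_db=-3.0, q=1.0)
for f in (100, 1000, 4000, 12000):
    print(f, "Hz:", round(gain_db_at(b, a, 48000, f), 2), "dB")
```

The cut is deepest at the center frequency and fades toward 0dB on either side, which is why a wide, shallow bell can soften 3-5kHz without removing the presence that helps the vocal cut through.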
De-essers are frequency-specific compressors that attenuate the frequency ranges we’ve been discussing in the last 2 chapters. Since they work just like a compressor, they’ll attenuate this range whenever the signal crosses the threshold, causing dynamic attenuation as opposed to an EQ’s static attenuation.
Let’s set the range for the frequencies between 4kHz and 10kHz and notice how we reduce harsh-sounding sibilants.
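To make the difference between static and dynamic attenuation concrete, here's a rough sketch of the gain logic a de-esser might use, written in Python. The threshold, ratio, and maximum reduction are assumed values for illustration only, and a real de-esser would also smooth the gain with attack and release times, which this sketch leaves out.

```python
def deesser_gain_db(band_level_db, threshold_db=-30.0, ratio=4.0,
                    max_reduction_db=12.0):
    """Gain in dB (0 or negative) to apply to the sibilant band.

    band_level_db is the measured level of the detection band
    (for example 4kHz to 10kHz) at this moment in time.
    """
    overshoot = band_level_db - threshold_db
    if overshoot <= 0:
        return 0.0  # below threshold: the band is left untouched
    reduction = overshoot * (1 - 1 / ratio)  # standard compressor curve
    return -min(reduction, max_reduction_db)

# A quiet passage, a moderate "s", and a very loud "s" (assumed levels).
for level in (-40.0, -22.0, 0.0):
    print(level, "dB in band ->", deesser_gain_db(level), "dB of gain")
```

An EQ would apply the same cut to all three moments, while this logic only turns the band down when the sibilance actually crosses the threshold.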
Although de-essers are very useful, they can have a sound to them, so to speak - if we want to remove harsh sibilants in a more transparent way, we can use clip gain when editing. To find sibilance, listen to the vocal, or look for dense clusters in the waveform.
Since sibilance is higher in frequency, the waveform representation will show waves closer together, indicating more oscillations in a shorter amount of time.
We can then isolate the clips, and use clip gain to turn them down. Let’s take a listen to this editing being applied.
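The idea of waves sitting closer together can be measured with a zero-crossing rate, which counts how often the waveform changes sign. Here's a small Python sketch of that, plus a clip gain helper. The two sine tones are stand-ins I'm assuming for a vowel and an "s" sound (real sibilance is noise-like), and the function names are hypothetical, not part of any DAW.

```python
import math

def zero_crossing_rate(samples):
    """Fraction of adjacent sample pairs where the signal changes sign."""
    crossings = sum(
        1 for prev, cur in zip(samples, samples[1:]) if (prev < 0) != (cur < 0)
    )
    return crossings / max(len(samples) - 1, 1)

def apply_clip_gain(samples, start, end, gain_db):
    """Return a copy with samples[start:end] scaled by gain_db."""
    gain = 10 ** (gain_db / 20)
    return [s * gain if start <= i < end else s for i, s in enumerate(samples)]

fs = 44100
# Assumed stand-ins: a 200Hz tone for a vowel, a 7kHz tone for an "s".
vowel = [math.sin(2 * math.pi * 200 * n / fs) for n in range(1024)]
ess = [0.5 * math.sin(2 * math.pi * 7000 * n / fs) for n in range(1024)]

print("vowel ZCR:", round(zero_crossing_rate(vowel), 4))
print("ess ZCR:  ", round(zero_crossing_rate(ess), 4))

# Turn down only the sibilant region of the combined clip by 6 dB.
clip = vowel + ess
edited = apply_clip_gain(clip, len(vowel), len(clip), -6.0)
```

The sibilant stand-in crosses zero far more often than the vowel, which is the numeric version of what we're spotting by eye in the waveform.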
Masking occurs when lower, more powerful frequencies cover up higher ones - by looking at this graphic, we can see that 250Hz masks or partially covers up a lot of our higher, more harsh-sounding frequencies. With that in mind, we could subtly increase the amplitude around 250Hz to reduce the perceived harshness.
Let’s try this in combination with reducing 3-5kHz, and notice how it tames the vocal.
If you want to add saturation to a vocal but don’t want to saturate your high frequencies, this can be a challenge without a multi-band saturator. One method to exclude your highs from saturation is to first set up a parallel track, and use a linear phase EQ.
With the EQ, use a low-pass filter to exclude your high frequencies - then add the saturator, and blend the saturation in using the channel fader.
Let’s take a listen and notice how the vocal sounds fuller, but not harsh.
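Here's one way the parallel chain could be sketched in Python, assuming a simple windowed-sinc low-pass as the linear phase filter and tanh as the saturator. The cutoff, drive, blend, and tap count are illustrative assumptions, and the function names are hypothetical. Since a linear phase filter delays the signal by a fixed amount, the dry signal is delayed by the same number of samples so the two paths line up, which is the job a DAW's delay compensation normally handles.

```python
import math

def lowpass_fir(fs, cutoff_hz, num_taps=101):
    """Symmetric (linear phase) windowed-sinc low-pass; num_taps must be odd."""
    mid = (num_taps - 1) // 2
    fc = cutoff_hz / fs  # cutoff in cycles per sample
    taps = []
    for n in range(num_taps):
        k = n - mid
        ideal = 2 * fc if k == 0 else math.sin(2 * math.pi * fc * k) / (math.pi * k)
        window = 0.54 - 0.46 * math.cos(2 * math.pi * n / (num_taps - 1))  # Hamming
        taps.append(ideal * window)
    total = sum(taps)
    return [t / total for t in taps]  # normalize to unity gain at DC

def fir_filter(x, taps):
    """Direct-form convolution, output the same length as the input."""
    out = []
    for n in range(len(x)):
        acc = 0.0
        for k, t in enumerate(taps):
            if n - k >= 0:
                acc += t * x[n - k]
        out.append(acc)
    return out

def parallel_saturation(dry, fs, cutoff_hz=3000.0, drive=4.0, blend=0.3,
                        num_taps=101):
    """Saturate only the low-passed copy, then blend it under the dry signal."""
    taps = lowpass_fir(fs, cutoff_hz, num_taps)
    delay = (num_taps - 1) // 2  # latency of the linear phase filter
    lows = fir_filter(dry, taps)
    wet = [math.tanh(drive * s) / math.tanh(drive) for s in lows]
    delayed_dry = ([0.0] * delay + list(dry))[:len(dry)]
    return [d + blend * w for d, w in zip(delayed_dry, wet)]
```

The blend value plays the role of the parallel track's channel fader: at 0 we hear only the dry vocal, and raising it brings in saturation that was generated from the lows and mids alone.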
When processing a signal in a digital system like a DAW, there’s a limit to how high our frequencies can go: the Nyquist frequency, or half the sample rate. Any frequencies that go above this limit will be reflected back down the frequency spectrum, adding harsh-sounding distortion, often called fold-back or aliasing distortion.
In short, oversampling temporarily raises the sample rate during processing, which increases the range the signal can occupy, and then uses filters to remove anything above the original limit before returning to the original rate.
Although it’s best not to saturate a vocal’s highs when trying to reduce harshness, let’s listen to it with and without oversampling to see if we notice a difference.
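To see where a reflected frequency ends up, here's a short Python sketch of the fold-back arithmetic. The 6kHz tone and its 5th harmonic are an assumed example of what a saturator might generate, not a measurement from this vocal.

```python
def aliased_frequency(f_hz, fs_hz):
    """Frequency at which a tone of f_hz appears after sampling at fs_hz."""
    f = f_hz % fs_hz
    return fs_hz - f if f > fs_hz / 2 else f

# Assumed example: saturating a 6kHz component creates a 5th harmonic at 30kHz.
harmonic = 5 * 6000

print(aliased_frequency(harmonic, 44100))  # 14100: folded back into the audible highs
print(aliased_frequency(harmonic, 88200))  # 30000: stays put with 2x oversampling
```

At 44.1kHz the harmonic lands at 14.1kHz, where it has no musical relationship to the vocal. At twice the sample rate it stays at 30kHz, where a filter can remove it before the signal is brought back down.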
If we’ve tried some of the other tips and they’re still not reducing harshness enough, we can try this unorthodox method. Using an EQ, let’s still reduce high frequencies around 3-5kHz, but additionally set the processing to the highest linear phase setting that’s available.
Without getting into lots of detail, this will delay the vocal by over a millisecond - your DAW will then try to compensate for that delay so that all instruments are in time.
As a result, mild phase cancellation occurs in the signal's quickest aspects, its transients, in turn reducing their impact and giving a smoother, less harsh sound.
Let’s take a listen to the EQ with and without Linear Phase enabled.
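For a sense of how large that delay can get, here's a tiny Python sketch. It assumes the linear phase mode is built on a symmetric FIR filter, which delays the signal by half its length; the tap counts are hypothetical, since plugins rarely publish theirs.

```python
def linear_phase_latency_ms(num_taps, fs_hz):
    """Delay of a symmetric FIR filter: (taps - 1) / 2 samples, in milliseconds."""
    return (num_taps - 1) / 2 / fs_hz * 1000

# Hypothetical tap counts for a low, medium, and maximum quality setting.
for taps in (129, 1025, 4097):
    print(taps, "taps:", round(linear_phase_latency_ms(taps, 48000), 2), "ms")
```

Under these assumptions even the shortest filter delays the vocal by over a millisecond, and the highest setting by tens of milliseconds, which is why the DAW has to compensate.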
Another strange method would be to emulate the response of a classic capsule - for example, the K47 used in the U47 microphone. Although the response doesn’t exactly reduce harsh frequencies, its sound has become synonymous with the vintage, smooth character of darker recordings.
If the capsule’s response does cause a harsher sound, simply reduce any filters amplifying those harsh areas.
Let’s take a listen to the capsule’s response being emulated.
Last up, I’m going to blend reverb in with my high frequencies to reduce their presence and hopefully some harshness. First I’ll select a studio emulation, something with a short decay time, then I’ll isolate the reverb to just my harsh frequencies, again 3-5kHz, and maybe some sibilance.
With the wet/dry I’ll blend the effect in, in turn replacing my vocal’s high and harsh frequencies with subdued reverb reflections.
Let’s take a listen to it.