Arrangement and Music Mixing: Understanding the Relationship

What’s Included in a Recording

It helps to know exactly what we’re mixing.

Each instrument has both transient and tonal elements. If you’ve heard of ADSR (attack, decay, sustain, release), it’s a simplified version of that.

The A and D make up the transient, and the S+R make up the tonal.

Let’s start with the tonal.

When a note is played on an instrument like a piano, guitar, or synth, or is sung, it includes a lot of tonal information.

For example, if a bass guitar plays A3 or 220Hz, what we hear as the actual note, or the frequency specific signal, are the tonal elements.

If we observe the signal, we’ll see 220Hz, or A3. Then, we’ll observe overtones or harmonics. These are whole number multiples of the foundational frequency or the fundamental.

For a 220Hz fundamental, they’d be 440Hz, 660Hz, 880Hz, and so on.

As the note is held out, and we observe the sustain and release, notice that these harmonics generally decline in amplitude the further away they are from the fundamental.
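To make the harmonic series concrete, here’s a minimal sketch that lists the first few partials of the 220Hz note. The 1/n amplitude rolloff is an assumption for illustration only (it’s roughly what a sawtooth does); real instruments each have their own harmonic balance. The function name is made up for this example.

```python
import math

def harmonic_series(fundamental_hz, count):
    """Return (frequency, level_db) pairs for the fundamental and its harmonics.

    Assumption: amplitude falls as 1/n, which is only a rough stand-in
    for how a real instrument's harmonics decline.
    """
    partials = []
    for n in range(1, count + 1):
        freq = fundamental_hz * n           # whole number multiple of the fundamental
        level_db = 20 * math.log10(1 / n)   # level relative to the fundamental
        partials.append((freq, level_db))
    return partials

for freq, level_db in harmonic_series(220.0, 5):
    print(f"{freq:7.1f} Hz  {level_db:6.1f} dB")
```

Each harmonic lands on a whole number multiple of 220Hz, and under this assumed rolloff the second harmonic already sits about 6dB below the fundamental.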

So, even though this bass note occupies the mids and even the high mids, it’s primarily a low and low-mid frequency signal. Its highest amplitude elements are in the lows and low mids.

This is really what defines how busy a frequency range is.

Additionally, various instruments will have distinct overtones and unique fundamental amplitudes. For example, a piano’s fundamental is often lower in amplitude than its overtones.

So if I played A3 on the piano, it would occupy the high mids and highs more than if I played A3 on the bass.

The arrangement then, that is, the notes played and the instruments chosen to play them throughout a song’s duration, is really what makes up the balance of a mix.

Generally speaking, a frequency response similar to pink noise mirrors the amplitudes needed for us to hear everything as equal in amplitude.

So a general decrease in amplitude from the lows to the highs is what you’ll notice in most mixes you analyze.
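As a rough reference for that slope: pink noise has power proportional to 1/f, so on a typical analyzer that shows level per Hz it falls about 3dB each time the frequency doubles. The sketch below just evaluates that relationship; the 100Hz reference point and the function name are arbitrary choices for the example, not a mixing target.

```python
import math

def pink_level_db(freq_hz, ref_hz=100.0):
    """Relative level in dB of a 1/f power spectrum at freq_hz, versus ref_hz."""
    return 10 * math.log10(ref_hz / freq_hz)

# Step up in octaves from the (arbitrary) 100 Hz reference.
for f in (100, 200, 400, 800, 1600, 3200, 6400):
    print(f"{f:5d} Hz  {pink_level_db(f):6.1f} dB")
```

A real mix won’t follow this line exactly, but the steady downward tilt is the general shape to expect.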

If the instruments are carefully chosen for their unique overtones and timbres, and the notes are carefully arranged, a mix should already be established before we begin mixing.

We’ll likely still need to control aspects with various processors; however, the frequency response will already be balanced more than not.

If we were to layer instruments with too much overlap, for example, if the kick and bass both played the same note or occupied the same frequency range, that’s when we’d end up with a muddy or unbalanced sound.

If the arrangement is conflicting, then the mix will conflict.

The Transients

Transients, or the attack and decay of the instrument, adhere to specific frequencies less often - in other words, if I hit a snare, the percussive or transient aspects aren’t necessarily a note.

The fundamental could be tuned to a note; however, the majority of overtones aren’t whole number multiples of the fundamental.

Transient elements exist in most tonal instruments as well. For example, pianos, guitars, vocals, and just about any instrument we associate with notes also have a hit or transient to them - the exception being a soft synth pad or an instrument whose envelope was designed to have a very slow attack and decay.

The percussive aspects of these instruments occupy a lot of the high mids and the highs.

For example, a vocal’s consonants and sibilants, or the short percussive aspects of the language, occupy the high mids and highs.

The scrape of a guitar string is in the high mids. The snap of a snare or kick is in the mids to high mids, and so on.

Since we’re more sensitive to these high frequencies, this works out to our benefit - having the highs dynamic allows for the overall frequency balance we’re accustomed to, while giving the ear some reprieve.

When the highs are too high in amplitude, such as when vocal sibilance is uncontrolled and aggressive, it ranges from mildly annoying to actually painful. So this dynamic relationship between tonal and transient is important.

The transients allow for high frequencies to be filled, while ideally, retaining a comfortable listening experience.

This isn’t to say that tonal elements don’t exist in this high range, just that they’re often too low in amplitude for what we perceive to be a balanced frequency response. The addition of high-frequency transient information is what establishes that balance.

How Arrangement Affects Mixing

The instruments present in the recording will inevitably affect the decisions you make, by altering what does and doesn’t sound balanced.

For example, say you have an acoustic guitar playing, and the fundamental is in the low mids.

If the entire arrangement was just an acoustic guitar and vocal, then the acoustic guitar could occupy the low mids. There isn’t as much present in the range due to the simple arrangement.

But, if we were to add bass, kick, snare, and other instruments that occupy the low mids, then we’d have to reconsider how the acoustic was mixed.

Since these other instruments are now occupying the low mids, it’s very likely we’ll achieve a muddy sound if we don’t attenuate some of the acoustic’s fundamental and low order harmonics.

If we were the ones doing the arrangement, we may even decide to play the acoustic guitar one octave higher to avoid this issue.

So, real quick, let’s listen to an acoustic with higher energy in the low mids. Notice that the response sounds more or less balanced. But, when I introduce other instrumentation with fundamentals or overtones in the low mids, the overall mix quickly becomes unbalanced.

Next, say the arrangement includes a synth with a lot of overtones between 2 and 5kHz. These overtones mask or cover up the vocal’s 3rd formant, or cluster of vowel and consonant info, making it harder to hear the vocal.

We may dip the range on the synth while boosting the range on the vocal. Or, again if we’re in charge of the arrangement, we may pick a synth without these overtones, or one with less presence in this range.

Next, let's say that the overall mix is lacking in the mids. Maybe the arrangement didn’t account for this range as much as needed.

If we were to saturate an instrument with a fundamental or fundamentals in the low mids, then the harmonics generated would form in the mids - in turn, amplifying this area and filling frequencies that may not be occupied.
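Here’s a small sketch of that idea, with a few assumptions: the "instrument" is a pure 200Hz sine, the saturator is a plain tanh curve, and the drive amount is arbitrary. None of these stand in for any particular plugin. The code measures how much energy sits at a few harmonics before and after saturation.

```python
import math

def tone_level(samples, freq_hz, sample_rate):
    """Estimate the amplitude at one frequency by correlating with sine and cosine."""
    n = len(samples)
    re = sum(s * math.cos(2 * math.pi * freq_hz * i / sample_rate)
             for i, s in enumerate(samples))
    im = sum(s * math.sin(2 * math.pi * freq_hz * i / sample_rate)
             for i, s in enumerate(samples))
    return 2 * math.hypot(re, im) / n

sample_rate = 8000
fundamental = 200.0  # hypothetical low-mid fundamental
drive = 3.0          # hypothetical drive amount

# One second of a pure tone, then the same tone pushed through tanh saturation.
dry = [math.sin(2 * math.pi * fundamental * i / sample_rate)
       for i in range(sample_rate)]
wet = [math.tanh(drive * s) for s in dry]

for harmonic in (1, 2, 3, 5):
    f = fundamental * harmonic
    print(f"{f:6.0f} Hz  dry {tone_level(dry, f, sample_rate):.3f}"
          f"  wet {tone_level(wet, f, sample_rate):.3f}")
```

The dry tone only has energy at 200Hz, while the saturated version picks up new content at 600Hz and 1000Hz, which is the mids being filled in. Because tanh is a symmetrical curve, it only adds odd harmonics; other saturation types will add a different mix.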

Alternatively or maybe in addition to the saturation, say we reverberate the vocal’s mids. The reflections would serve a similar purpose. They’d amplify the range, a good amount with the early reflections and then less as time goes on. If subtle modulation was introduced, they’d occupy frequencies that weren’t occupied with the original arrangement.

It really comes back to filling the frequency response in an intentional way. If a range is already at an amplitude that’s enjoyable and balanced, then introducing processing that amplifies it doesn’t always make sense.

There’s an exception to this though. Say we wanted to introduce a creative effect, for example, delay with unique behavior, and it also amplifies an already balanced range, then we’d need to attenuate the range either before or after introducing the creative effect to retain that balance.

Arrangement and Dynamics

We’ve touched on this a bit when we discussed transients, but let’s cover it in more depth.

The RMS level of a mix, or its average loudness, is contrasted with the peak level. This isn’t always the case, but more often than not, the transients create the peak level while the tonal elements create the RMS.

It feels obvious to say, but by being quick in nature and high in amplitude, transients will often be what you’re measuring with a peak meter.

Meanwhile, the tonal elements exist at a moderate amplitude but for a longer period of time, in turn, contributing more to the average loudness.
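To see the difference in numbers, here’s a sketch that measures peak and RMS on a made-up note: a 220Hz tone with a short, loud attack that quickly settles to a quieter sustain. The envelope values are assumptions picked for illustration, not measurements of a real instrument.

```python
import math

def peak_db(samples):
    """Highest absolute sample value, in dB relative to full scale (1.0)."""
    return 20 * math.log10(max(abs(s) for s in samples))

def rms_db(samples):
    """Root-mean-square level in dB relative to full scale (1.0)."""
    mean_square = sum(s * s for s in samples) / len(samples)
    return 10 * math.log10(mean_square)

def crest_factor_db(samples):
    """Distance between peak and RMS: larger means more transient-heavy."""
    return peak_db(samples) - rms_db(samples)

sample_rate = 8000

# Hypothetical note: loud attack decaying into a sustain at 0.2 amplitude.
note = [(0.2 + 0.8 * math.exp(-i / 40)) * math.sin(2 * math.pi * 220 * i / sample_rate)
        for i in range(sample_rate)]

# The same note with the attack removed, leaving only the tonal sustain.
sustain_only = [0.2 * math.sin(2 * math.pi * 220 * i / sample_rate)
                for i in range(sample_rate)]

for name, samples in (("with transient", note), ("sustain only", sustain_only)):
    print(f"{name:15s} peak {peak_db(samples):6.1f} dB  "
          f"rms {rms_db(samples):6.1f} dB  "
          f"crest {crest_factor_db(samples):5.1f} dB")
```

The attack sets the peak reading almost entirely on its own, while the RMS barely changes whether it’s there or not, since it lasts such a short time.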

If the performers are particularly talented and in control of their instruments, and micing and gain staging are done properly, then a lot of this dynamic relationship can be achieved during recording.

But when these relationships are off, compression and other dynamics processing come into play.

For most recordings prior to mixing, the peak level is too high, and the RMS is too low. This causes temporal masking, in which the high amplitude peak covers up what comes directly after it.

With dynamics processing, the main goal is to bring the RMS up relative to the peaks to counteract this masking.

For example, if I compress an acoustic guitar with peak compression, then I’m often attenuating the transients. With the additional headroom provided by attenuating peaks, I can now amplify the overall signal, bringing up the tonal elements and increasing the RMS.
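That sequence can be sketched in a few lines. This is a heavily simplified compressor, assuming a hard knee and an instant response with no attack or release smoothing, so treat it as an illustration of the gain math rather than how a real compressor behaves. The "guitar" is the same kind of made-up note as before, and the threshold and ratio are arbitrary.

```python
import math

def peak(samples):
    return max(abs(s) for s in samples)

def rms(samples):
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def to_db(value):
    return 20 * math.log10(value)

def compress_peaks(samples, threshold, ratio):
    """Simplified compressor: scale down whatever exceeds the threshold.

    Assumption: instant response with no attack/release, hard knee.
    """
    out = []
    for s in samples:
        level = abs(s)
        if level > threshold:
            level = threshold + (level - threshold) / ratio
        out.append(math.copysign(level, s))
    return out

sample_rate = 8000

# Hypothetical acoustic guitar note: sharp pick attack, quieter sustain.
guitar = [(0.2 + 0.8 * math.exp(-i / 40)) * math.sin(2 * math.pi * 220 * i / sample_rate)
          for i in range(sample_rate)]

compressed = compress_peaks(guitar, threshold=0.3, ratio=4.0)

# Use the headroom gained from the lower peaks to bring everything back up.
makeup_gain = peak(guitar) / peak(compressed)
louder = [s * makeup_gain for s in compressed]

for name, samples in (("before", guitar), ("after", louder)):
    print(f"{name:7s} peak {to_db(peak(samples)):6.1f} dB  "
          f"rms {to_db(rms(samples)):6.1f} dB")
```

The peak ends up where it started, but the RMS is higher because the sustained part of the note came up with the make-up gain.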

Genre plays a big role in this as well. For example, if you’re mixing jazz, you’d likely make the musicians frustrated if you altered the dynamics too much. The dynamics established during the performance are likely very intentional.

For other genres like rock, fewer instruments are typically responsible for the variation between RMS and peak levels. Usually, the kick and snare create the peak level, while the other instrumentation is responsible for establishing the RMS.

So the vocal, bass, guitar, and so on can be compressed more, while the kick and snare are compressed, but in a manner that retains dynamics.

So, I hope this helped explain the relationship between arrangement and mixing; I’m of course barely scratching the surface, and the recording process, which I didn’t touch on too much, plays a huge role.