Are A.I. Vocals the Future?

Taking a Look at the Session

So we have a simple track: some bass, drums, guitar, piano, and of course a vocal.

To generate the various singers, I got a pro subscription to Lalas so that I could test out a lot of the different options they have. Now, this video isn’t sponsored or anything, so I’ll give you my actual thoughts, again, while doing my best not to be too opinionated.

As you can see, there are literally hundreds of different voices to emulate, from popular singers to cartoon characters.

So in this video, we’re testing out Drake, Ed Sheeran, Ariana Grande, Michael Bublé, Kanye, Michael Jackson, Beyoncé pitched an octave up, Billie Eilish, and Ellie Goulding pitched an octave up.

Then, for my personal amusement, we have Hank Hill, Rick Sanchez, and, of course, Morgan Freeman, pitch-shifted an octave down for good measure.

Real quick let’s listen to the demo with the original vocals.

Watch the video to learn more >

Creating AI Vocals

So, with a tuned bounced-out version of the lead vocals, I’ll upload that to the site, determine if I want to pitch shift the vocals, and then download it when it’s ready.

Right off the bat, I’m noticing an issue - my voice is kind of raspy, and the AI really didn’t like that.

Watch the video to learn more >

You’ll notice this kind of squeaking sound since I don’t think the AI knows what to do if there’s any yelling or more aggressive aspects in the vocal. Also, I had these vocals compressed with the channel strip, but Lalas adds a lot of compression and maximization to the vocal, so it’s best to avoid any processing other than tuning before importing a track.

Another issue I noticed is with “oohs.” For example, I did some falsetto - the original vocals definitely aren’t great, but I feel like AI Ed Sheeran didn’t help me out here.

Watch the video to learn more >

So, I redid the vocals and made them as smooth as possible - I didn’t add any compression when recording, just tuned them after the fact, and this gave me much better results.

Here’s Drake, Ed Sheeran, and Ariana Grande.

Watch the video to learn more >

So, as you can hear, it’s better, but you’ll still notice a lot of artifacts if it’s soloed. Those become less apparent when played in the context of the mix, but still, these artifacts will only get louder during mixing and mastering.

Also, notice how the dynamics are affected by the AI emulation - the original vocal is pretty tame and retains its dynamics, while the AI-processed versions have a lot more compression.

Okay, let’s check out Michael Bublé, Kanye, and MJ.

Watch the video to learn more >

Kanye definitely surprised me here - it actually sounded like Kanye. MJ, not so much, so I think I’m learning that these are kind of hit or miss. I imagine the original vocal’s styling has a big impact on how similar it sounds to the artist you’re trying to emulate.

For example, at one point, I tried Cardi B and it wasn’t bad, but it just didn’t sound like her at all. So I think if you use AI Vocals, keep in mind your performance has to be at least somewhat similar to the artist for the effect to work well.

Moving on, I was curious how the pitch shifting sounded, and how well it could take a male vocal and make it sound as if it was sung by a female vocalist.
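For context on what those octave shifts are doing under the hood, the math is simple equal temperament: every semitone multiplies the frequency by the twelfth root of two, so twelve semitones up doubles the pitch and twelve down halves it. Here’s a minimal sketch in plain Python - my own illustration, not how Lalas actually implements its shifting:

```python
def semitone_ratio(semitones: float) -> float:
    """Frequency ratio for a pitch shift in equal temperament.

    Each semitone multiplies frequency by 2 ** (1/12), so +12 semitones
    (an octave up) doubles the pitch and -12 halves it.
    """
    return 2 ** (semitones / 12)

print(semitone_ratio(12))   # 2.0 - octave up, as used on the female voices
print(semitone_ratio(-12))  # 0.5 - octave down, as used on Morgan Freeman
```

The hard part a service has to solve is shifting pitch by that ratio without also shifting the formants, which is what makes a naively pitched male vocal sound like a chipmunk instead of a female singer.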

So let’s listen to Beyoncé, Billie Eilish, and Ellie Goulding.

Watch the video to learn more >

Beyoncé sounded impressive to me; I guess even AI Beyoncé is in a league all her own. The other two sounded fine - I’m noticing a good deal of variation in the intensity of the artifacts, though.

Billie Eilish’s vocals definitely have a lot of noise and weird distortion - it’s easier to notice when they’re soloed.

So, I guess how useful this is depends on your tolerance for those types of things.

Okay, as promised, here’s Hank Hill, Rick Sanchez, and our boy Morgan Freeman.

Watch the video to learn more >

I don’t know about you, but hearing Hank Hill singing made my day. I mean, it’s so absurd that it’s hard not to have fun with this type of thing.

So, if you’re using a platform like this just to be entertained, I’d say it’s 10/10; please take my money. But let’s see if we can make the AI singers sound a little better with some additional processing.

Improving AI Vocals with Processing

I’m going to use Kanye’s vocal, and start with a Gate to remove some of the artifacts or at least make them less noticeable. Then, I’ll add some EQ to cut out any unneeded lows below the vocal’s fundamental and make some general adjustments to make it sound more natural.

Now, you probably noticed how sibilant his vocal and a lot of the others were, so I dipped some of the highs and added a de-esser to balance that region out. A little saturation gave the vocal some needed complexity, and then temporal effects, a ducked reverb emulating a studio room and a chorus/doubler effect, gave it some needed polish.

Notice that I avoided compression - if you use this service or one like it, the AI vocals already seem super compressed, so I’d definitely recommend you skip any additional compression.
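To make two stages of that chain concrete, here’s a toy sketch of a noise gate and tanh saturation in plain Python. It’s a didactic illustration of the idea, not the actual plugins I used - function names and thresholds are made up for the example:

```python
import math

def noise_gate(samples, threshold=0.02):
    # Mute anything quieter than the threshold so low-level
    # AI artifacts between phrases drop out.
    return [s if abs(s) >= threshold else 0.0 for s in samples]

def saturate(samples, drive=2.0):
    # tanh soft clipping adds gentle harmonics, giving a sterile
    # AI vocal a bit of the complexity mentioned above.
    return [math.tanh(drive * s) / math.tanh(drive) for s in samples]

vocal = [0.5, 0.001, -0.3, 0.0005]  # a few fake sample values
print(saturate(noise_gate(vocal)))
```

Note the order matters: gating first means the saturation never amplifies the quiet artifacts you just removed.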

Here’s Kanye’s vocal with the effects, and for fun, I processed Beyoncé’s so we could hear the two together.

Watch the video to learn more >

Final Thoughts on AI Vocals

I’m kind of mixed on whether this is the future of music, or how much of a role it’ll play. Something I haven’t touched on is the legality of a service like this - I imagine that at any given point, this company has 100 cease-and-desist letters in their mailbox.

I notice they use AI-generated images for each artist and say “Inspired” under each artist. So, this isn’t Drake; it’s Drake-Inspired. It’s not Ed Sheeran; it’s Ed Sheeran-Inspired, and so on.

But I can’t imagine that something like this will last very long, given the potential for copyright infringement, or for violating an artist’s right of publicity by using their likeness to sell a product. But I don’t know, I’m certainly no expert on that type of stuff.

As for the results, I was pleasantly surprised by some of it. But you really have to be careful about how you perform, and how you process a vocal beforehand to get usable results - and even then the results are a mixed bag with a good deal of artifacts and some singers simply not sounding right. I imagine this will get better as the AI improves, but who knows.

What I want to know, though, are your thoughts - which vocalist was your favorite? Do you think AI Vocals or AI in general have a place in music? And did I make a mistake by not creating a choir of Hank Hills?