Match pitch and feel of two speech audio files of the same person

Question

I recorded a video project (a woman speaking on camera), and we then realized a section was missing: some audio content we needed to insert. So we recorded the additional audio, and I'm inserting it into the original audio. It's the same person speaking, but the pitch of the original recording is just a bit above the newly recorded audio, and the "feel" isn't the same (different mics, different room, etc.). I'm wondering if Audition has a way of helping me do some digital magic to match these two recordings and make them sound like the whole thing is just one long take. Does anyone know how I might do this?

Before you say it, of course we could just re-shoot the entire thing but this would include re-renting the cameras, setting up the shot, paying for the person to come back in, etc. etc. Before we go that far I thought I'd ask the question. And in case you're wondering, the video shows the person talking, then the shot goes off to show other images while the voice continues over the images. It's in this "other images" section where we're inserting the new audio file. After this, the scene goes back to the person speaking and she finishes saying what she's saying.

Any help? I'm guessing I can use some EQ adjustments to "match" the feel as best I can, lay in some noise to "feather" it, and possibly use pitch bending to bend the higher-pitched original video audio down and the new recording up so the pitches meet, but I'm not sure if this is the best way to go about it. Is there an easier or more accurate way?

Thank you.

SteveG_AudioMasters_ · Answer

havenc63783602 wrote:
I'm wondering if Audition has a way of helping me do some digital magic to match these two recordings and make them sound like the whole thing is just one long take. Does anyone know how I might do this?

I'm afraid that this is a very common request, and that, essentially, the answer is no. Even with voice-overs recorded before and after a lunch break using identical equipment, people can hear a difference, and even that can't be fixed. The problem here is that human hearing is very sensitive to even the slightest change of inflexion etc. in voices, especially if you run two phrases, recorded at different times and places, together.

But it's that last bit that gives you some potential wriggle room. There are a few things you can do which will inevitably be cheaper than a reshoot, and which relieve the immediate texture jump at the transition. The first is to impose a bit of narrative from a different voice - even if, on the face of it, it isn't necessary at that point. The second is to impose a musical or sound effect transition, with a slight break in the narrative. If you really think that you can't do either of those, then consider running the music or the sound effect underneath as an underscore - this will soften the effect on the ears, but generally I don't think that this is enough.

I suppose in a way that is 'digital magic'. Well, it is when you compare how easy it is now with doing the same thing when I was a video editor, which involved a lot of tape, a huge pile of machinery and things that are now lost in the mists of time - like pre-rolls on every single edit...
