
Multitrack/Speaker Caption editing

Community Beginner, Sep 27, 2023

Hey team!

 

Love all the improvements that have been added to text-based editing. In the world of TikTok, Instagram, YouTube Shorts, and social media in general, there's been a surge in content with burned-in captions. I know you care about social media editing because you introduced a vertical profile. A lot of social content comes from longer-format podcasts. I personally edit podcasts with 3+ speakers, but captioning clips with multiple speakers is kind of a headache. Sure, Premiere can differentiate the speakers when you transcribe, but the moment you convert to captions the text gets jumbled together, and it takes a lot of correcting to fix who's speaking.

 

With text-based editing came the ability to transcribe the source clip. I wish that could be used to transcribe and create captions for every track/speaker on separate caption tracks, which could then be converted to graphics layers.

 

The current workaround I've found is re-transcribing the sequence and selecting the individual audio track instead of Mix -> creating captions -> repeating for track 2 -> track 3, etc. I end up with a timeline like this:

Screenshot 2023-09-27 at 3.49.05 AM.png
Screenshot 2023-09-27 at 3.50.48 AM.png

I then clean up and correct the captions and convert them to graphic layers so I can do animations, and end up with a final timeline looking like this:

Screenshot 2023-09-27 at 3.54.49 AM.png

In the area where the captions overlap, I have a split screen with each speaker's text underneath them.

 

Screenshot 2023-09-27 at 4.03.31 AM.png 

 

So to sum it up, I think it would be great if we could get caption tracks for individual speakers.

 

And on the topic of captions, it would be great to have the option to set a keyboard shortcut, exactly like Edit selected caption text, that works on a graphic layer so there's less moving of the mouse.

 

Would love to hear your thoughts on this!

Thank you,

Oscar Alva

 

 

 

Idea (no status)
Topics: Editing and playback, Graphics, User experience or interface

3 Comments
Participant, Oct 16, 2023

It would be so useful to have caption tracks for individual speakers. Here's my use case... I need to transcribe and edit a Zoom call recording that has 11 different speakers. If I just transcribe the original audio, Premiere Pro freaks out and can't identify all the speakers accurately. It only seems to be made to deal with maybe 3-4 different speakers. It identifies about 5 of them and labels them mostly wrong. 

Fortunately, I had Zoom record separate audio for each speaker. I created a sequence with the Zoom video and stacked all the audio tracks under the video. Now each speaker has its own track. I got Premiere Pro to transcribe each track and named each speaker in their source clip transcript.

 

multi.JPG

 

However, when I click over to see the sequence transcript, it only shows the transcript of the first track. Whenever someone else is speaking, there is just a giant pause in the transcript. 

 

In my opinion, the transcript for the sequence should merge all the transcripts from each audio track and give you one master transcript with the correct speaker names beside the text. This would be great for dealing with interviews where the audio is recorded with multiple mics and audio recorded separately from the video. 

 

Anyway, maybe there's already a way to have a master transcript that combines the transcripts from multiple clips in a sequence, but I have not found it. 
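
To make the "master transcript" idea concrete, here is a minimal Python sketch of the merge itself. It is not a Premiere Pro feature or API. It assumes each track's transcript has already been exported into a list of timed segments, and the `Segment` type, its field names, and the sample data are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """One spoken passage (hypothetical structure, not a Premiere Pro type)."""
    start: float   # seconds from the start of the sequence
    end: float
    speaker: str
    text: str


def merge_transcripts(tracks):
    """Combine per-track segment lists into one list ordered by start time."""
    merged = [seg for track in tracks for seg in track]
    merged.sort(key=lambda seg: (seg.start, seg.end))
    return merged


def format_transcript(segments):
    """Render segments as '[MM:SS] Speaker: text' lines."""
    lines = []
    for seg in segments:
        minutes, seconds = divmod(int(seg.start), 60)
        lines.append(f"[{minutes:02d}:{seconds:02d}] {seg.speaker}: {seg.text}")
    return "\n".join(lines)


# Made-up sample data: one list of segments per audio track.
host = [
    Segment(0.0, 4.0, "Host", "Welcome back."),
    Segment(9.5, 12.0, "Host", "Great point."),
]
guest = [Segment(4.2, 9.0, "Guest", "Thanks for having me.")]

master = merge_transcripts([host, guest])
print(format_transcript(master))
```

Because every segment carries its own speaker name, the merged list keeps the correct label next to each line no matter which track it came from.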

Community Expert, Oct 17, 2023

@OscarAlva @superkevkt,

 

A few thoughts.... The programmers are actively adding functions that will make what you want possible. But the devil is in the details, and actually getting to your end goal is challenging.

 

Multi-channel support.

 

Release version 24.0.0 adds transcription for multiple audio channels in a single file - but only for mono channels. Stereo is already in the Beta version. Each file only gets one transcript, but the easy workaround is "breakout to mono," and each of those separate items can hold its own transcript. Active work on these features continues in the Beta version.

 

> it only shows the transcript of the first track. 

True. Muting a track allows the next track's transcript to show, and so on down the stack. Solo does not have this effect. Removing part of an audio track also allows the next track's transcript to show through. Yes, a text-based editing workflow is challenging.

 

But these new functions, and the other workarounds you describe, only get you to the stage of having separate transcripts for each speaker or a single transcript that has speakers labeled. On the timeline, they are just about unusable. For the next step, I imagine a multicam-type process as the way to edit. How would that work? You could have multiple text panel windows that show your speakers. Or ....? What is practical?

 

Multi-speaker support.

 

It would help just to be able to create separate caption tracks based on speaker identification in the transcript. Also, provide an option to put the speaker name at the beginning of caption text. It could be a setting in the EGP that could be part of a style.
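
As a rough illustration of what "separate caption tracks based on speaker identification" could mean outside the application, the Python sketch below groups speaker-labeled captions into one SRT document per speaker, with an optional speaker-name prefix. It does not use any Premiere Pro API. The tuple layout, function names, and sample captions are assumptions made for the example, and importing each resulting file onto its own caption track is an assumed follow-up step, not something shown here.

```python
from collections import defaultdict


def srt_timestamp(seconds):
    """Format seconds as an SRT timestamp, HH:MM:SS,mmm."""
    total_ms = round(seconds * 1000)
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, ms = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def split_by_speaker(captions):
    """Group (start, end, speaker, text) tuples into one cue list per speaker."""
    tracks = defaultdict(list)
    for start, end, speaker, text in captions:
        tracks[speaker].append((start, end, text))
    return dict(tracks)


def to_srt(cues, speaker=None):
    """Render cues as SRT text; optionally prefix each cue with the speaker name."""
    blocks = []
    for index, (start, end, text) in enumerate(cues, start=1):
        line = f"{speaker}: {text}" if speaker else text
        blocks.append(
            f"{index}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{line}\n"
        )
    return "\n".join(blocks)


# Made-up sample captions with speaker labels.
captions = [
    (0.0, 2.5, "Speaker 1", "Welcome to the show."),
    (2.5, 5.0, "Speaker 2", "Glad to be here."),
    (5.0, 7.25, "Speaker 1", "Let's get started."),
]

tracks = split_by_speaker(captions)
for name, cues in tracks.items():
    print(f"--- {name} ---")
    print(to_srt(cues, speaker=name))
```

Passing `speaker=None` to `to_srt` leaves the caption text untouched, which corresponds to the "option" part of the suggestion: the prefix is a per-export choice rather than something baked into the captions.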

 

Animation style captions.

 

Upgrade caption to graphic was a huge step, but it is still challenging to find a template or automated way to accomplish this.

 

@TeresaDemel @Francis-Crossman Any updates about future directions? Very exciting times!

 

Stan

 

 

 

Participant, Oct 17, 2023

I managed to get what I wanted, which was a single transcript with all 11 speakers, pulled from multiple audio tracks. What I did was:

 

  1. Created a sequence with all the audio tracks synchronized and stacked under the video.
  2. Transcribed all the audio tracks individually and set the speaker name for that track in each transcript.
  3. Used the Razor tool to snip the pieces that had speech from the lower tracks and dragged those pieces up to track 1. It was always fine to overwrite what was there because you don't want more than one person speaking anyway.
  4. After doing this, the sequence has a single transcript with multiple speakers!

 

Make sure to preserve an intact audio file with all speakers if you want to keep a natural-sounding audio track with people speaking over each other. You can mute all the individual audio tracks and just use your original audio track when you encode the video.
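
Before flattening everything onto track 1, it can help to know where speakers actually talk over each other, since those are the spots where overwriting drops someone's words. Here is a small Python sketch of that check, assuming the speech segments are available as plain `(start, end, speaker)` tuples. The data layout and the sample values are hypothetical, and nothing here talks to Premiere Pro.

```python
def find_overlaps(segments):
    """Return (earlier, later) pairs of segments from different speakers that overlap in time."""
    ordered = sorted(segments, key=lambda seg: seg[0])
    overlaps = []
    for i, (start_a, end_a, speaker_a) in enumerate(ordered):
        for start_b, end_b, speaker_b in ordered[i + 1:]:
            if start_b >= end_a:
                break  # sorted by start time, so no later segment can overlap this one
            if speaker_a != speaker_b:
                overlaps.append(
                    ((start_a, end_a, speaker_a), (start_b, end_b, speaker_b))
                )
    return overlaps


# Made-up sample: speech segments from three tracks, in seconds.
segments = [
    (0.0, 5.0, "A"),
    (4.0, 8.0, "B"),
    (8.0, 10.0, "A"),
    (9.5, 11.0, "C"),
]

overlaps = find_overlaps(segments)
for earlier, later in overlaps:
    print(f"{earlier[2]} and {later[2]} overlap starting at {later[0]}s")
```

Segments that merely touch, such as one ending at 8.0 and the next starting at 8.0, are not counted as overlapping.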
