Combine text in live transcript and better export control

Report · Apr 17, 2024

I currently export transcripts from Premiere Pro for a podcast episode every month.

I still spend a lot of time after every export, because exporting a CSV file is the only way to get the text out so that I can quickly remove the timecode and combine speaker lines into paragraphs.

If only I could combine the text in the timeline transcript inside Premiere. Wherever there is no gap between audio clips (Premiere can already detect this, since it knows where a ripple delete can be applied to a gap), it could treat that as a reason to combine the text back into a paragraph. Either the text could be cleaned up in Premiere, or the speaker turns could be recombined into paragraphs during the transcript export. It would also be nice to export a doc file with the option of having paragraphs and turning off the timecode if need be.

Report · Apr 17, 2024

@Blue-Marble,

Upvoted. Yes, there is no automatic way to combine transcript segments by speaker. And there is no control over export format. The developers are working on bulk selection of speakers, and I hope to see more progress in this area.

Workaround: Export as CSV, and use Excel macro programming to remove the timecode, combine segments by speaker, and put the speaker name at the beginning of each speaker segment. Once your code is done, this will be easier.

Workaround for combining segments: Manually combine speaker transcript segments in PR. If it is a source media transcript, be sure you are in source view. Drag-Select the segments you want to merge, then merge. The transcript will export with the merged segments.

If you haven't used merge since release version 23.4, see this post for a discussion of the "new" merge behaviors:

https://community.adobe.com/t5/premiere-pro-bugs/merge-captions-suddenly-gray-and-unlcikable/idc-p/1...

Stan

Report · Apr 17, 2024

The sentence parts in the source, if merged into paragraphs, do not show up as combined in CSV according to my testing.

What does the macro code for this look like? Does it only work in Excel? Will Google Sheets or Libre Office also accept it?

Report · Apr 17, 2024

> The sentence parts in the source, if merged into paragraphs, do not show up as combined in CSV according to my testing.

It is merged in my test. The merged segments are grouped by a single speaker and timecode, then the merged text.

> What does the macro code for this look like? Does it only work in Excel? Will Google Sheets or Libre Office also accept it?

The code would have to be written. No, only Excel. I don't think any of the other options will run Excel macros (a version of VBA). And no, this is not for the faint of heart.

An alternative would be to use something like Notepad++ and program a keystroke macro there.
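A general scripting language would be another route that does not need Excel. Here is a minimal sketch in Python, not tested against a real Premiere export. The header names "Speaker Name" and "Text" are assumptions, so check the first row of your own CSV and change them to match. The function name `csv_to_paragraphs` is made up for this example.

```python
import csv
import io


def csv_to_paragraphs(csv_text, speaker_col="Speaker Name", text_col="Text"):
    """Return 'Speaker: text' paragraphs, merging consecutive rows by one speaker.

    speaker_col and text_col are ASSUMED header names; adjust them to
    whatever the first row of the actual export contains.
    """
    paragraphs = []  # list of [speaker, text] pairs, in transcript order
    for row in csv.DictReader(io.StringIO(csv_text)):
        speaker = row[speaker_col].strip()
        text = row[text_col].strip()
        if paragraphs and paragraphs[-1][0] == speaker:
            # Same speaker as the previous row: extend that paragraph.
            paragraphs[-1][1] += " " + text
        else:
            # New speaker turn: start a new paragraph.
            paragraphs.append([speaker, text])
    # Timecode columns are simply never read, so they drop out of the result.
    return "\n\n".join(f"{s}: {t}" for s, t in paragraphs)
```

To use it, read the exported file into a string, pass it to `csv_to_paragraphs`, and write the result to a new text file that can be pasted into a document.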

Stan

Report · Apr 18, 2024

Stan, if you know how something works, please encourage everyone to learn. Faint of heart or not, getting better at this is important to me.

Barring needing to buy Excel just for this, do you have a link somewhere on how to write something simple in C++? It sounds worthwhile to use if it works every time, unlike Premiere Pro.

Report · Apr 18, 2024

Johannes,

I hear you. I don't know C++ and don't have a link. I see no reason why a variety of approaches would not work. The task is to take the predictable PR format, keep some info (speaker name and transcript text), and remove other info (timecodes).

I would not buy Excel just for this purpose; it is just what I happened to have learned over various projects (unrelated to video editing).

Let me clarify your goal. Is the exported transcript meant to be made available as text for the podcast? Or is it used temporarily as part of the editing process?

Stan

Report · Apr 18, 2024

Yes, the task is to take the transcript and provide it as an easy-to-read document to podcast listeners.

Report · Apr 18, 2024

Yes, that is a great, practical use-case. And I think a workaround is possible.

If using a text editor to modify, the advantage to exporting the transcript as .txt is that each part to be kept/removed is on separate lines.

Manually merge transcript entries for the same speaker to get what you want. Or, for now, live with smaller segments/paragraphs and multiple speaker labels. I have some ideas on how to handle that, but no time to test.

So, for example, here's a sample:

00:00:01:07 - 00:01:18:19
Bill
Some text on one or more lines. Some text on one or more lines. Some text on one or more lines. Some text on one or more lines. Some text on one or more lines. Some text on one or more lines. Some text on one or more lines. Some text on one or more lines. 

00:01:18:20 - 00:02:28:24
Mary
Some text on one or more lines. Some text on one or more lines. Some text on one or more lines. Some text on one or more lines. Some text on one or more lines. Some text on one or more lines. Some text on one or more lines. Some text on one or more lines. Some text on one or more lines. Some text on one or more lines. Some text on one or more lines.

I am using my favorite (PC only?) - Notepad++

Ctrl+Home (Go to top)

Start recording macro

Shift+Down-arrow (selects timecode line)

Delete key (deletes timecode and line. First line is now "Bill", cursor is at beginning of line)

End key (cursor is right after "Bill" and before the line ending)

Type a colon and space (First line is now Bill: - Or use whatever formatting you want.)

Delete key (deletes line ending, First line is now "Bill: Some text ....")

Ctrl+F (Find)

Enter 00: (This works as long as the program is under one hour. A more flexible approach would need modification.)

Click "Find Next" (Cursor is at beginning of next timecode. May have 00: selected)

Home (moves cursor to beginning of line, without selection. We are now in the position in the next transcript segment where we started the macro in the first)

Stop recording macro

Save the macro

To process a .txt transcript, open it and press Ctrl+Home to be sure you are at the beginning.

Macro -> Run a macro multiple times

Pick your saved macro and check "run until end of file"

There is a flaw in the last entry, and possibly other problems that I have not tested. But this gives you an idea.
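The same steps can also be written as a short script instead of a keystroke macro. This is a minimal sketch in Python, assuming the .txt export always looks like the sample above: a timecode line, a speaker line, one or more text lines, then a blank line. The names `clean_transcript` and `TIMECODE_LINE` are made up for this example, and it has not been run against a real export. It matches the timecode by pattern rather than by searching for 00:, so programs over one hour should work too, and the last entry is handled the same way as the others.

```python
import re

# Matches a timecode range line such as "00:00:01:07 - 00:01:18:19".
# Any two-digit hour is accepted; ";" before the frames (drop-frame) is allowed.
TIMECODE_LINE = re.compile(
    r"^\d{2}:\d{2}:\d{2}[:;]\d{2}\s*-\s*\d{2}:\d{2}:\d{2}[:;]\d{2}\s*$"
)


def clean_transcript(raw: str) -> str:
    """Drop timecode lines and turn each segment into 'Speaker: text'."""
    paragraphs = []
    # ASSUMPTION: segments are separated by one or more blank lines,
    # and the text of a segment contains no blank lines itself.
    for block in re.split(r"\n\s*\n", raw.strip()):
        lines = [ln.strip() for ln in block.splitlines() if ln.strip()]
        # Remove the timecode line, if this segment has one.
        lines = [ln for ln in lines if not TIMECODE_LINE.match(ln)]
        if not lines:
            continue
        # First remaining line is the speaker; the rest is the spoken text.
        speaker, text = lines[0], " ".join(lines[1:])
        paragraphs.append(f"{speaker}: {text}" if text else speaker)
    return "\n\n".join(paragraphs)
```

To use it, read the exported .txt file into a string, pass it to `clean_transcript`, and save the returned text under a new file name so the original export stays untouched.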

Let us know how it works (or doesn't).

Did you get the merge/export to work?

Stan

Report · Apr 21, 2024

Thank you for showing me. I will try that.

No, the export of the transcript from the timeline view still has separated sentence parts when I export.

First little bit:

Salma Uche Okeke
Our guest today is Selma

00:00:42:21 - 00:00:43:17
Salma Uche Okeke
Etareri

00:00:43:17 - 00:00:44:16
Salma Uche Okeke
Born in

00:00:44:16 - 00:00:46:12
Salma Uche Okeke
1968, in Bad

00:00:46:13 - 00:00:47:17
Salma Uche Okeke
Hofgstein
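For an export fragmented like this, the pieces can be recombined after the fact. Below is a minimal sketch in Python, assuming the export has already been reduced to (speaker, text) pairs in transcript order. The function name `merge_speaker_turns` is made up for this example. It joins every run of consecutive segments from the same speaker into one paragraph and leaves the wording and capitalization exactly as exported.

```python
from itertools import groupby


def merge_speaker_turns(segments):
    """Join consecutive (speaker, text) pairs by the same speaker into one paragraph each."""
    merged = []
    # groupby only groups ADJACENT items, which is what a speaker turn is:
    # the same speaker appearing again later starts a new paragraph.
    for speaker, group in groupby(segments, key=lambda seg: seg[0]):
        text = " ".join(seg[1] for seg in group)
        merged.append((speaker, text))
    return merged


segments = [
    ("Salma Uche Okeke", "Our guest today is Selma"),
    ("Salma Uche Okeke", "Etareri"),
    ("Salma Uche Okeke", "Born in"),
    ("Salma Uche Okeke", "1968, in Bad"),
    ("Salma Uche Okeke", "Hofgstein"),
]
for speaker, text in merge_speaker_turns(segments):
    print(f"{speaker}: {text}")
```

With the five fragments above, this produces a single paragraph for the one speaker. Stray capitals left over from the cuts, such as "Born" in the middle of a sentence, would still need a manual pass.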

Report · Apr 22, 2024

@Blue-Marble,

Thanks for responding.

That is odd. To clarify, you are working with and exporting a Transcript, not Captions, correct? Before exporting, are you seeing the text merged/combined?

Stan

Report · May 21, 2024

I haven't referred to Captions, no.

Before exporting, I see the text separated from the timeline edit.

Report · May 21, 2024

@Blue-Marble,

This is what I now suspect, and I do not see a good solution.

You can only merge in the source view. And that merged transcript is what you have when you switch to Sequence/timeline/Program Monitor view. BUT if you have edited the clip, it breaks that merged transcript into new segments based on the edit. And look at the segment sizes you show, for example in your post with the "First little bit:" sample: a regular transcript would never have segments this small for one speaker. Is that making any sense?

IF that is the issue, I would try this. BEFORE editing the sequence, in Source view, merge the whole transcript into one segment. (Select all of it and hit Merge.) Edit anything that involves removing large sections, but do not remove pauses or filler words etc. Export the sequence transcript. Now do small edits (remove pauses, filler words etc), and edit the exported transcript as needed.

Stan

Report · May 21, 2024

The example I showed is for an audio product. Adobe has not released transcription for Audition. This is real life, and this is happening. This is for a regular transcript. It doesn't make sense that you can't believe me that these small edits are being done for audio. I haven't found any offline product that can edit audio transcripts like Premiere Pro.

Now you are suggesting losing the live edit of the transcript and the audio and just forgetting about using Adobe for live editing. Basically I should forgo any sort of improvement from Adobe for this purpose, making my workflow twice as slow.

Report · May 21, 2024

Sorry. I'm not suggesting abandoning anything. I'm just trying to understand your workflow and whether we can find a way to get a more readable transcript exported.

Stan

Report · May 21, 2024

I showed you what Premiere Pro exports from the edit script. The millions of edits I made to the audio are reflected in the script. Even if combined in the source as speaker turns, the exports still reflect the millions of edits I made for the audio. This is not good, and it needs to be recombined without timecode in a CSV and sent to a Word doc for natural reading. If Premiere would just automatically do that in a text file, my life would be much less stressful.

Report · May 22, 2024

Yes, back to the beginning of this feature request: I like the option and upvoted. And I did not understand part of your workflow and made a mess of the workaround.

I would try this. Once you are done with your edits, create a Static Transcription of your sequence. You CAN merge transcript segments of a static transcript. But you must manually select each speaker group as you go. Export that as .txt or .csv.

Stan

Report · May 22, 2024

I will try this and let you know.

Report · May 23, 2024

The only way to create a static transcript is to re-transcribe it. I don't want to lose all the sentence and spelling corrections I made. That doesn't seem like the way to do what I want.

Report · May 23, 2024

> The only way to create a static transcript is to re-transcribe it. I don't want to lose all the sentence and spelling corrections I made.

So that might work if you were doing this in a new project?

But try this method for your already edited transcriptions:

Copy the project file in case this is a big mess.

With your source transcriptions edited, and cut up in the sequence, in sequence view, export transcript to .txt. We don't need them merged or anything. But this export should have the times and text edits correct, right?

Now create a static transcript of that sequence.

Transcript tab -> Import -> Import corrected transcript and pick the edited (source) transcript you exported.

In my test, it worked fine, and included my edits.

Stan