Known Participant
April 17, 2024
Open for Voting

Combine text in live transcript and better export control

  • 18 replies
  • 1990 views

I currently export a transcript from Premiere Pro for a podcast episode about once a month.

Every export still costs a lot of time afterwards. Exporting as CSV is the only way to get the text into a form where I can quickly remove the timecode and combine speaker lines into paragraphs.

If only I could combine the text in the timeline transcript inside Premiere. Wherever there is no gap between audio clips, Premiere could treat that as a reason to combine the text back into a paragraph; it can already detect gaps, since that is where a ripple delete can be applied. Either the transcript could be cleaned up in Premiere, or the speaker turns could be recombined into paragraphs during the transcript export process. It would also be nice to export a doc file with the option of having paragraphs and turning off the timecode if need be.


Known Participant
April 22, 2024

Thank you for showing me. I will try that.

No, the transcript exported from the timeline view still has the sentences split into separate parts.

 

First little bit:


Salma Uche Okeke
Our guest today is Selma

00:00:42:21 - 00:00:43:17
Salma Uche Okeke
Etareri

00:00:43:17 - 00:00:44:16
Salma Uche Okeke
Born in

00:00:44:16 - 00:00:46:12
Salma Uche Okeke
1968, in Bad

00:00:46:13 - 00:00:47:17
Salma Uche Okeke
Hofgstein



Stan Jones
Community Expert
April 18, 2024

Yes, that is a great, practical use case. And I think a workaround is possible.

 

If you use a text editor to modify the transcript, the advantage of exporting it as .txt is that each part to be kept or removed is on a separate line.

 

Manually merge transcript entries for the same speaker to get what you want. Or, for now, live with smaller segments/paragraphs and multiple speaker labels. I have some ideas on how to handle that, but no time to test.
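
The merging of same-speaker entries can also be scripted. A minimal Python sketch, assuming the transcript has already been read into (speaker, text) pairs; the name merge_turns and the tuple layout are illustrative only, not anything Premiere Pro provides:

```python
from typing import Iterable, List, Tuple

# (speaker, text) pairs; this layout is an assumption made for the sketch.
Segment = Tuple[str, str]


def merge_turns(segments: Iterable[Segment]) -> List[Segment]:
    """Join consecutive segments from the same speaker into one paragraph."""
    merged: List[Segment] = []
    for speaker, text in segments:
        text = text.strip()
        if merged and merged[-1][0] == speaker:
            # Same speaker as the previous entry: extend that paragraph.
            merged[-1] = (speaker, f"{merged[-1][1]} {text}".strip())
        else:
            # Speaker changed (or first entry): start a new paragraph.
            merged.append((speaker, text))
    return merged


if __name__ == "__main__":
    sample = [
        ("Bill", "Some text."),
        ("Bill", "More text."),
        ("Mary", "A reply."),
    ]
    for speaker, text in merge_turns(sample):
        print(f"{speaker}: {text}")
```

Only adjacent entries are merged, so the order of speaker turns is preserved and a speaker who returns later gets a new paragraph.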

 

So, for example, here's a sample:

00:00:01:07 - 00:01:18:19
Bill
Some text on one or more lines. Some text on one or more lines. Some text on one or more lines. Some text on one or more lines. Some text on one or more lines. Some text on one or more lines. Some text on one or more lines. Some text on one or more lines. 

00:01:18:20 - 00:02:28:24
Mary
Some text on one or more lines. Some text on one or more lines. Some text on one or more lines. Some text on one or more lines. Some text on one or more lines. Some text on one or more lines. Some text on one or more lines. Some text on one or more lines. Some text on one or more lines. Some text on one or more lines. Some text on one or more lines. 

 

I am using my favorite (PC only?) - Notepad++

 

Ctrl+Home (Go to top)

Start recording macro

Shift+Down-arrow (selects timecode line)

Delete key (deletes timecode and line, First line is now "Bill", cursor is at beginning of line)

End key (cursor is right after "Bill" and before the line ending)

Type a colon and space (First line is now Bill: - Or use whatever formatting you want.)

Delete key (deletes line ending, First line is now "Bill: Some text ....")

Ctrl+F (Find)

Enter 00: (This works as long as the program is under one hour. A more flexible approach would need modification.)

Click "Find Next" (Cursor is at beginning of next timecode. May have 00: selected)

Home (moves cursor to beginning of line, without selection. We are now in the position in the next transcript segment where we started the macro in the first)

 

Stop recording macro

Save the macro

 

To process a .txt transcript, open it and press Ctrl+Home to be sure you are at the beginning.

Macro -> Run a macro multiple times

Pick your saved macro and check "run until end of file"

 

There is a flaw in the last entry, and possibly other problems that I have not tested. But this gives you an idea.
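
The keystroke steps above could also be expressed as a short script. A minimal Python sketch, assuming the .txt export follows the layout in the sample (a timecode line, a speaker line, then one or more text lines); strip_timecodes is a hypothetical helper name. Because it matches the whole timecode pattern instead of searching for 00:, it is not limited to programs under one hour, and the last entry needs no special handling.

```python
import re
from typing import List

# Matches a full "HH:MM:SS:FF - HH:MM:SS:FF" line, as in the sample above.
TIMECODE = re.compile(r"\d{2}:\d{2}:\d{2}:\d{2} - \d{2}:\d{2}:\d{2}:\d{2}")


def strip_timecodes(transcript: str) -> List[str]:
    """Turn timecode/speaker/text blocks into 'Speaker: text' lines."""
    entries: List[List[str]] = []
    for raw in transcript.splitlines():
        line = raw.strip()
        if TIMECODE.fullmatch(line):
            # A timecode line starts a new entry and is itself dropped.
            entries.append([])
        elif line and entries:
            # First line of an entry is the speaker, the rest is text.
            # Lines that appear before the first timecode are skipped.
            entries[-1].append(line)
    return [f"{e[0]}: {' '.join(e[1:])}".rstrip() for e in entries if e]


if __name__ == "__main__":
    sample = (
        "00:00:01:07 - 00:01:18:19\n"
        "Bill\n"
        "Some text on one or more lines.\n"
        "\n"
        "01:01:18:20 - 01:02:28:24\n"
        "Mary\n"
        "Some text on one or more lines.\n"
    )
    print("\n\n".join(strip_timecodes(sample)))
```

To run it on a real export, read the file into a string first and pass that string to strip_timecodes.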

 

Let us know how it works (or doesn't).

 

Did you get the merge/export to work?

 

Stan

 

Known Participant
April 18, 2024

Yes, the task is to take the transcript and provide it as an easy-to-read document to podcast listeners.

Stan Jones
Community Expert
April 18, 2024

Johannes,

 

I hear you. I don't know C++ and don't have a link. I see no reason that a variety of approaches might not work. The task is to take the predictable PR format and keep some info (speaker name and transcript text) and remove other info (timecodes).

 

I would not buy Excel just for this purpose; it is just what I happened to have learned over various projects (unrelated to video editing).

 

Let me clarify your goal. Is the exported transcript meant to be made available as text for the podcast? Or is it used temporarily as part of the editing process?

 

Stan

 

Known Participant
April 18, 2024

Stan, if you know how something works, please encourage everyone to learn. Faint of heart or not, getting better at this is important to me.

Short of buying Excel just for this, do you have a link somewhere on how to write something simple in C++? It sounds worthwhile if it works every time, unlike Premiere Pro.

Stan Jones
Community Expert
April 17, 2024

> The sentence parts in the source, if merged into paragraphs, do not show up as combined in CSV according to my testing.

It is merged in my test. The merged segments are grouped by a single speaker and timecode, then the merged text.

 

> What does the macro code for this look like? Does it only work in Excel? Will Google Sheets or Libre Office also accept it?

The code would have to be written. No, only Excel. I don't think any of the other options will run Excel macros (a version of VBA). And no, this is not for the faint of heart.

 

An alternative would be to use something like Notepad++ and program a keystroke macro there.

 

Stan

 

 

 

Known Participant
April 17, 2024

The sentence parts in the source, if merged into paragraphs, do not show up as combined in CSV according to my testing.

What does the macro code for this look like? Does it only work in Excel? Will Google Sheets or Libre Office also accept it?

Stan Jones
Community Expert
April 17, 2024

@Blue-Marble,

 

Upvoted. Yes, there is no automatic way to combine transcript segments by speaker. And there is no control over export format. The developers are working on bulk selection of speakers, and I hope to see more progress in this area.

 

Workaround: Export as CSV, and use Excel macro programming to remove timecode, combine segments by speaker, format speaker name at beginning of each speaker segment. Once your code is done, this will be easier.
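
For those without Excel, the same CSV clean-up might be done in Python instead. A rough sketch, assuming the export has one row per segment with a speaker column and a text column. The column names "Speaker Name" and "Text" are guesses, so check the header row of your own file and pass the real names; csv_to_paragraphs is a hypothetical helper name.

```python
import csv
import io
from itertools import groupby
from typing import List

# Hypothetical column names: check the header row of your own export.
SPEAKER_COL = "Speaker Name"
TEXT_COL = "Text"


def csv_to_paragraphs(csv_text: str,
                      speaker_col: str = SPEAKER_COL,
                      text_col: str = TEXT_COL) -> List[str]:
    """Keep only speaker and text, and combine each run of one speaker."""
    rows = csv.DictReader(io.StringIO(csv_text))
    paragraphs: List[str] = []
    # groupby only groups adjacent rows, so speaker turns stay in order.
    for speaker, run in groupby(rows, key=lambda row: row[speaker_col]):
        text = " ".join(row[text_col].strip() for row in run)
        # Speaker name goes at the beginning of each speaker segment.
        paragraphs.append(f"{speaker}: {text}")
    return paragraphs


if __name__ == "__main__":
    sample = (
        "Speaker Name,Start Time,End Time,Text\n"
        "Bill,00:00:01:07,00:00:02:00,Hello\n"
        "Bill,00:00:02:01,00:00:03:00,there.\n"
        "Mary,00:00:03:01,00:00:04:00,Hi.\n"
    )
    print("\n\n".join(csv_to_paragraphs(sample)))
```

The timecode columns are never read, so they drop out without any explicit removal step.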

 

Workaround for combining segments: Manually combine speaker transcript segments in PR. If it is a source media transcript, be sure you are in source view. Drag-Select the segments you want to merge, then merge. The transcript will export with the merged segments.

 

If you haven't used merge since release version 23.4, see this post for a discussion of the "new" merge behaviors:

https://community.adobe.com/t5/premiere-pro-bugs/merge-captions-suddenly-gray-and-unlcikable/idc-p/13891108#M11033

 

Stan