Participant
June 1, 2023

Import Corrected Transcript (TXT) bug

  • June 1, 2023
  • 13 replies
  • 3515 views

I have tried this in the current and beta versions of Premiere, and both give me strange issues.
I sent a 1.5-hour edit out for proofreading. I copied the corrections into a text document and tried Text > Import > Import Corrected Transcript (TXT), and I get horrible results.
I have the original TXT file so I tried to import that - also horrible results.
The first half has the timecode and speaker name but shows <Type your caption here> until about 00:43:50:00, where it starts filling in the captions from the beginning, but with the wrong speaker. The text is off, and towards the end it just puts all of the remaining text under the last timecode.
Same results in both versions.  Using both the original exported text file and the corrected file.
The text corrections took the proofreader over a day to review. Starting again and working in SRT format would be costly. Am I missing something?

13 replies

Stan Jones
Community Expert
July 1, 2024

@cameraman_nick, I may never reach an end to my experiments!

 

Did the revised .txt translation work better? Let us know.

 

For me there are several issues.

 

Does the "import corrected" (IC) work for what I believe is its limited purpose - round tripping to get spelling and other basic edits? Generally, with a few exceptions for a bug or two, the answer has been yes.

 

Does IC work for substituting an externally produced transcript? Answer - sometimes. The script/transcript must be only the spoken words. As content is added (or non-spoken subtitles such as [Music], etc.), the result may not hold up. But limited additions may.

 

Does IC work for importing a translation of the PR generated transcript? As I say in the other post, I am now concluding that logically this should NOT work well (unless a new feature is added to do so). We have no control over the timecodes, and they cannot be edited. Translations are not just word for word - idioms and other changes make the translation shorter or longer than the original, and I doubt that IC is set up to retime the imported transcript to fit within the original timecode range. If the goal is to do Text-based editing in the translated language, I don't see how it can work reliably.

 

For example, if I move the 2 sentences added to segment one back to segment two, it throws the timing off. And captions generated from that revision are incorrect.

 

So far, I think that if the goal is burned in or sidecar captions, it is better to do the translation import at that stage. There you have control of the timecodes and the words.

 

@Kerstin Ebert @Alexander_DVA Can you correct anything I am misunderstanding? Do we have a roadmap for ongoing work to help translations at the transcript or the captions stage?

 

Stan

 

 

Stan Jones
Community Expert
June 29, 2024

@cameraman_nick,

 

The Spanish .txt file is ANSI (not UTF-8, as preferred) and uses classic Mac line endings (CR only, rather than Linux LF only or Windows CRLF).

 

Converting to UTF-8 did not help, but converting the line endings did. (I used Notepad++ on a PC to do this.) There are still significant differences in the content of particular segments, and significant differences between the translated captions produced from the English transcription and those produced from the Spanish corrected transcription.
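For anyone without Notepad++, the same conversion can be scripted. This is only a minimal sketch: the function name `normalize_transcript` is made up for illustration, the source encoding is assumed to be Windows-1252 ("ANSI"), and CRLF is assumed as the target line ending, since the thread does not record which ending was actually used.

```python
def normalize_transcript(raw: bytes,
                         source_encoding: str = "cp1252",
                         newline: str = "\r\n") -> bytes:
    """Re-encode a transcript as UTF-8 and rewrite its line endings.

    source_encoding and newline are assumptions; adjust them to match
    the file you actually have and the ending you want.
    """
    text = raw.decode(source_encoding)
    # Collapse CRLF and bare CR to LF first so no ending gets doubled.
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    if newline != "\n":
        text = text.replace("\n", newline)
    return text.encode("utf-8")


if __name__ == "__main__":
    # A CR-only sample, like the classic Mac line endings described above.
    sample = "line one\rline two\r".encode("cp1252")
    print(normalize_transcript(sample))
```

To apply it to a file, read the bytes with `pathlib.Path(...).read_bytes()`, pass them through the function, and write the result to a new file so the original translation is left untouched.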

 

I have previously experimented with using "import corrected" for translations, and results have varied. But logically this should NOT work well - transcription in the spoken language means that the AI logic behind it can match word for word. But the translation cannot fit word for word. For example, the first segment in English is 283 characters including spaces. The Spanish is 297. And naturally, the translator puts all that text in the first segment, which lasts 25 seconds. When the translation is imported as corrected, that first segment is 432 characters, and includes 2 full sentences from the next segment.

 

What is the significance of text such as "construcci—n" in the translation? I wonder if that is throwing it off.
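One possible explanation, offered as a guess rather than a diagnosis: the file may have been saved in the Mac Roman encoding (which would fit the CR-only line endings) and then read as Windows-1252. In Mac Roman the byte 0x97 is "ó", while in Windows-1252 the same byte is a long dash. The short sketch below reproduces the garbled spelling quoted above and reverses it; the helper name `repair` is hypothetical.

```python
def repair(garbled: str) -> str:
    """Undo a Mac Roman file that was misread as Windows-1252.

    Only valid if that misreading is really what happened (an assumption).
    """
    return garbled.encode("cp1252").decode("mac_roman")


if __name__ == "__main__":
    word = "construcción"
    # Save as Mac Roman, then misread the bytes as Windows-1252.
    garbled = word.encode("mac_roman").decode("cp1252")
    print(garbled)          # the accented o comes back as U+2014
    print(repair(garbled))  # round-trips to the original word
```

If this is what happened, decoding the file as `mac_roman` before converting it to UTF-8 would restore the accented characters.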

 

Still poking at this....

 

Stan

 

 

 

Stan Jones
Community Expert
June 28, 2024

@cameraman_nick,

 

I can test this. I sent you a PM.

 

Stan

 

Participating Frequently
June 28, 2024

Has anyone found a solution to this issue? I'm running into the same problem now. The English version was translated to Spanish - the timecode and speaker names did not change between the .txt files. When I import the corrected transcript - the Spanish version - the timecode and text are all over the place, with big gaps. I've included an image of the PP transcript panel with the incorrect text/timecode and an image of the correct .txt file. Working in PP v24.5. Any insight on how to fix this?

 

 

 

 

 

Stan Jones
Community Expert
January 15, 2024

@Alysha25885702p6j5,

 

I just tested with a 2:30 English clip. I used your GoogleDocs method and translated into Spanish and German. The results looked pretty good.

 

The translated .txt file has the same timecodes. PR does adjust some timecodes a bit on import.

 

I see places where the character count varies from the original (as expected). I suspect that if it varies by much, that can throw the import off.
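A quick way to see where the counts diverge is to compare the two .txt files segment by segment. This is a rough sketch under stated assumptions: a new segment is assumed to start on any line containing an HH:MM:SS:FF timecode, every other line (speaker names included) is counted toward that segment, and the names `segment_lengths` and `compare` are made up for illustration.

```python
import re

# HH:MM:SS:FF, the timecode shape seen in this thread (e.g. 00:43:50:00).
TIMECODE = re.compile(r"\d{2}:\d{2}:\d{2}:\d{2}")


def segment_lengths(transcript: str) -> list[tuple[str, int]]:
    """Return (first timecode on the header line, character count) per segment."""
    segments = []
    current_tc = None
    count = 0
    for line in transcript.splitlines():  # handles CR, LF and CRLF
        match = TIMECODE.search(line)
        if match:
            if current_tc is not None:
                segments.append((current_tc, count))
            current_tc, count = match.group(0), 0
        elif current_tc is not None:
            count += len(line.strip())
    if current_tc is not None:
        segments.append((current_tc, count))
    return segments


def compare(original: str, translated: str) -> list[tuple[str, bool, int, int, int]]:
    """Pair segments in order: (timecode, timecodes match, len A, len B, difference)."""
    rows = []
    for (tc_a, n_a), (tc_b, n_b) in zip(segment_lengths(original),
                                        segment_lengths(translated)):
        rows.append((tc_a, tc_a == tc_b, n_a, n_b, n_b - n_a))
    return rows


if __name__ == "__main__":
    a = "00:00:00:00 - 00:00:25:00\nSpeaker 1\nHello there\n"
    b = "00:00:00:00 - 00:00:25:00\nSpeaker 1\nHola a todos\n"
    for row in compare(a, b):
        print(row)
```

Segments with a large difference in the last column are the ones to inspect first after the import.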

 

What languages are you translating from and to? How long is your clip?

 

I can test if you can provide an mp3 of the audio.

 

Stan

 

Participant
January 14, 2024

Also encountering this bug as described by Vince. I will try to download the Beta 24.2 to see if there's any change. My workflow has been to export .txt, then copy-paste into a Google Doc and use the translate tool, which creates a translated copy of the document without altering any of the timecode formatting. When I've tried to import the corrected transcript .txt, I get the same glitchy outcomes, where <Type your caption here> appears for most of the transcript and the timecodes are jumbled.

Kerstin Ebert
Adobe Employee
November 20, 2023

Hi @firste13845131,

 

we recently added some improvements to the 24.1 version to fix a bug where importing a corrected transcript as a .txt file would create larger gaps in the transcript that's shown in the Text panel. Are you still seeing any issues in the 24.1 release version (or the 24.2 beta)?

 

Regarding your last comment, do you only see the bug with non-English transcripts, but it works fine with English transcripts? Could you share a screenshot of what it looks like after you imported the corrected transcript?

 

Thanks,

Kerstin

Participant
November 20, 2023

Still considering the fact I'm talking about a non-English transcription...

Participant
November 20, 2023

@Jorge33643669qtn4 I tried installing Premiere 2024 Beta last week, and it seems like it does work in this specific update.

In 'regular' Premiere 2024 you still cannot transcribe an audio file, download the transcription as a text file, edit the text in a local editor, and import it back into Premiere.

I was surprised to see it finally working in the Beta.

Stan Jones
Community Expert
November 19, 2023

@Jorge33643669qtn4,

 

What version are you using? The basic function has not been showing these issues.

 

Export transcript as .txt and import it "as corrected" without making ANY changes. What happens?

 

Stan