Participant
May 20, 2026
Open for Voting

Audio/Speech generation UI based off Premiere's captioning UI

The current beta UI for speech generation is not viable for projects longer than a few minutes. Even a 5-minute project that takes advantage of the range of features currently available could easily generate an unmanageable number of files, turning what should be a straightforward process into a tedious task of wading through bad edits and duplicates created in error, in search of the 6 or so that represent your completed work.

Instead, implement a text-block workflow where the system accepts the entire project text and parses it into text blocks (sized as the user chooses). The user then works with those blocks through a hybrid of the captioning conventions (merging/splitting text blocks, editing or adding text, etc.) and the speech generation features (voice, speed, pitch, tone/emotion, etc.).
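To make the idea concrete, here is a rough sketch in Python of what a text-block model could look like. It is only an illustration under my own assumptions: the names `VoiceSettings`, `TextBlock`, `parse_blocks`, `split_block`, and `merge_blocks` are hypothetical and do not correspond to any existing Adobe API, and the sentence splitting is deliberately naive.

```python
import re
from dataclasses import dataclass, field, replace


@dataclass
class VoiceSettings:
    # Hypothetical per-block settings; field names are illustrative only.
    voice: str = "default"
    speed: float = 1.0
    pitch: float = 0.0
    emotion: str = "neutral"


@dataclass
class TextBlock:
    text: str
    settings: VoiceSettings = field(default_factory=VoiceSettings)


def parse_blocks(project_text: str, max_chars: int = 200) -> list[TextBlock]:
    """Pack sentences into blocks of at most max_chars characters.

    A single sentence longer than max_chars becomes its own block.
    """
    sentences = re.split(r"(?<=[.!?])\s+", project_text.strip())
    blocks: list[TextBlock] = []
    current = ""
    for sentence in sentences:
        if not sentence:
            continue
        candidate = f"{current} {sentence}".strip()
        if current and len(candidate) > max_chars:
            blocks.append(TextBlock(current))
            current = sentence
        else:
            current = candidate
    if current:
        blocks.append(TextBlock(current))
    return blocks


def split_block(blocks: list[TextBlock], index: int, at: int) -> list[TextBlock]:
    """Split blocks[index] at character offset `at`; each half gets a copy of the settings."""
    block = blocks[index]
    left = TextBlock(block.text[:at].strip(), replace(block.settings))
    right = TextBlock(block.text[at:].strip(), replace(block.settings))
    return blocks[:index] + [left, right] + blocks[index + 1:]


def merge_blocks(blocks: list[TextBlock], index: int) -> list[TextBlock]:
    """Merge blocks[index] with the next block, keeping the first block's settings."""
    first, second = blocks[index], blocks[index + 1]
    merged = TextBlock(f"{first.text} {second.text}", replace(first.settings))
    return blocks[:index] + [merged] + blocks[index + 2:]
```

The point of the sketch is that voice, speed, pitch, and emotion live on each block, so a split or merge carries the settings along instead of forcing the user to re-enter them.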

Additionally, allow the users to preview partial blocks or bridges between blocks, and to generate individual, selected, or all blocks.
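As a rough illustration of the preview and generation options above, the sketch below treats blocks as plain strings. The `synthesize` callback stands in for whatever speech engine would actually be used; it and the function names `bridge_text` and `generate` are assumptions of mine, not part of any real product API.

```python
from typing import Callable, Iterable, Optional


def bridge_text(blocks: list[str], index: int, words: int = 5) -> str:
    """Return the text across the seam between blocks[index] and blocks[index + 1].

    Takes the last `words` words of the first block and the first `words`
    words of the second, so the transition can be previewed on its own.
    """
    if words <= 0:
        raise ValueError("words must be positive")
    tail = blocks[index].split()[-words:]
    head = blocks[index + 1].split()[:words]
    return " ".join(tail + head)


def generate(
    blocks: list[str],
    synthesize: Callable[[str], object],
    selected: Optional[Iterable[int]] = None,
) -> dict[int, object]:
    """Run `synthesize` on the selected block indices, or on every block if none are given."""
    indices = range(len(blocks)) if selected is None else sorted(set(selected))
    return {i: synthesize(blocks[i]) for i in indices}
```

Passing a single index covers the "individual" case, a list of indices covers "selected", and omitting `selected` covers "all", so only the blocks the user actually asks for would be generated.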

I believe this model of UI maximizes the opportunity to apply current and future editing features of speech generation at a granular level without the attendant tedium experienced in the current beta.

It also provides users with the ability to experiment, improve, and advance their skills using the features, increasing their satisfaction with your product, while simultaneously mitigating the aspects of the current experience that decrease satisfaction and/or engender animosity (e.g., credit loss due to error, bad results, or unintended duplication).

Below is a summary of the expressed or implied features defined by implementing the Premiere caption model for speech generation:

  • Entire project text is loaded at once (eliminating cut & paste errors and more)
  • Text-block workflow:
    • Stay in the groove, not being pulled out every 5 minutes to cut, paste, and find your place again
    • Focus on the quality: more opportunity to explore and experiment with the tools available
    • Precision edits reduce cost/frustration/tedium while enhancing customer experience and product
    • Project save files become a library of reusable audio clips, format presets, and even DJ samples

 

Okay, I may not have exhausted the available benefits, but I have exhausted myself on the topic for the moment. Hopefully I have not exhausted you, the reader, as well.

Peace.

BeMo