If I am interpreting this correctly, why are the pauses in text-based editing measured in seconds and percentages? It is also very inaccurate: sometimes a pause shown as "0.8 s" in the text window is actually 16 frames at 25 fps in the timeline window. What is the logic here? On top of that, the AI might detect that pause on a good day, but not always. And then it decides to cut not where audible human speech ends (around -60 to -80 dB), but right through whole vowel parts of words. The accuracy of automatic cutting needs to be much higher.
What I see is very inaccurate: a pause is only detected there once a cut has already been made.
No pause is detected, yet there are multiple pauses in this sentence. A cut shows that one of these pauses is 0.4 s, and even though the minimum pause length is set below that, the software still can't detect it without a cut.
Right now, pause length always has to be measured manually from the timeline. In this day and age, entering that number should not require a separate conversion from frames per second to percentages. Even if the increment is finer than a single frame, adjusting the pause length directly would still be far more useful than constantly clipping too much or trimming manually every time. As it stands, there is a lot of needless guesswork. Make it more accurate: if the frames-per-second conversion were already done for me, I could simply use the number I already see in the timeline. Don't use percentages.
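For reference, the conversion being asked for here is trivial for software to do automatically. A minimal sketch (the function names and the 16-frames-at-25-fps example are mine, taken from the mismatch described above):

```python
def frames_to_seconds(frames: int, fps: float) -> float:
    """Convert a frame count at the timeline frame rate to seconds."""
    return frames / fps

def seconds_to_frames(seconds: float, fps: float) -> int:
    """Convert seconds to the nearest whole frame at the timeline frame rate."""
    return round(seconds * fps)

# The example from above: 16 frames at 25 fps is 0.64 s, not 0.8 s.
print(frames_to_seconds(16, 25))    # 0.64
print(seconds_to_frames(0.64, 25))  # 16
```

The point is that the user should only ever see timeline units (frames, or seconds at the timeline frame rate); any percentage the detector needs internally can be derived from these.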
This needs to be more intuitive. Let the software calculate a percentage if it needs one internally. Or let me mark the pause length more accurately with an in and out point to search for, or base it on seconds plus the timeline's frame rate; provide two input boxes for this instead of just one. Then have it warn me when it detects sounds that might be clipped vowels. Every speaker is different every time, so I need to adjust detection based on the current speaker's average pause.
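The per-speaker adjustment suggested above could be as simple as deriving the minimum-pause threshold from the gaps already measured for that speaker. A hypothetical sketch (the function names, the 0.5 factor, and the sample gap lengths are my own illustration, not anything the software currently exposes):

```python
def average_pause(pause_lengths_s: list[float]) -> float:
    """Mean of the detected pause lengths (in seconds) for the current speaker."""
    return sum(pause_lengths_s) / len(pause_lengths_s)

def min_pause_threshold(pause_lengths_s: list[float], factor: float = 0.5) -> float:
    """Set the detection threshold to a fraction of the speaker's average pause."""
    return factor * average_pause(pause_lengths_s)

# Example gaps measured from the timeline for one speaker.
pauses = [0.4, 0.8, 0.6, 1.0]
print(round(min_pause_threshold(pauses), 2))  # 0.35
```

A threshold tied to the speaker's own rhythm would avoid the current situation where one global value misses 0.4 s pauses for a fast talker while over-cutting a slow one.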