Create a higher quality "compute lip sync from scene audio"

Report · Feb 07, 2023

Please create a setting (or settings) for a more accurate "lip sync from scene audio". Maybe keep the current method for people who are happy with the current quality and processing time, but let people choose a much higher quality setting (and let the software run for 10x the current time if necessary). I currently spend many hours fixing the visemes for my dialogue (at least 30 hours for a 30-minute animation). There isn't one line of dialogue that I don't have to fix.

Is this possible?

Report · Feb 07, 2023

Does this help? I've been having a lot of the same issues. It may be a bit worse for me since my audio is in Portuguese. As some people have suggested before, it could help to be able to input the text associated with the audio, to have more options for expanding the audio on the timeline, or to sync it with subtitles.

For me, one of the biggest drags, and probably the most time-consuming part, is editing the visemes. When I sync the project in After Effects or edit the audio in Audition, I can narrow things down to much smaller fractions of time, which makes it easier to sync things to the audio. Even being able to insert the visemes while editing the audio in Audition, or while inserting the captions in After Effects, would already be a great help. That would let me listen to a specific section just once. As it is, I listen multiple times when editing the audio, multiple times in After Effects to insert subtitles, add effects, or set up time remapping on a video, and then I still have to change the visemes in Character Animator, where I can only see a very thin line on the timeline for the scene track.

It's a lot easier to pinpoint these things with a big waveform (physically being able to see it bigger on the screen in Character Animator) and with the ability to zoom in and see fractions of time in more detail, so I can match the waveform to the viseme changes just by dragging. By default, the system forces a viseme to snap to a frame or second, so I have to Alt-click to get a specific point in time, and sometimes I have to repeat this several times because neither the beginning nor the end of the viseme matches a frame.

The video I'm working on now is a lot longer, 7m:28s, which is a lot of time to hear my own voice over and over in 3 programs. Hopefully this assists you in helping us. I've seen that there are a lot of requests concerning issues with visemes, support for other languages, and more efficient ways to have visemes match the audio files.
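As a rough illustration of the frame-snapping issue described above, here is a minimal Python sketch of how far a viseme boundary can move when it is forced onto the nearest frame. The 24 fps frame rate, the onset time, and the function names are assumptions made up for this example; this is not how Character Animator is implemented.

```python
def snap_to_frame(time_s: float, fps: float) -> float:
    """Return the frame boundary (in seconds) nearest to time_s."""
    return round(time_s * fps) / fps


def snap_error_ms(time_s: float, fps: float) -> float:
    """Return how far snapping moves time_s, in milliseconds."""
    return abs(snap_to_frame(time_s, fps) - time_s) * 1000.0


if __name__ == "__main__":
    fps = 24.0     # assumed scene frame rate (hypothetical)
    onset = 1.230  # hypothetical viseme start picked on a zoomed-in waveform

    print(f"snapped onset: {snap_to_frame(onset, fps):.3f} s")
    print(f"shift from snapping: {snap_error_ms(onset, fps):.1f} ms")
    # The worst case is half a frame in either direction.
    print(f"worst-case shift: {1000.0 / (2 * fps):.1f} ms")
```

Under these assumptions, a boundary placed at 1.230 s lands on 1.250 s, a shift of about 20 ms, which is roughly the half-frame worst case at 24 fps.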

Report · Feb 07, 2023

Yes! It seems like there should be a way to "Sensei" this particular problem.

It would be great to have an opt-in option in the program's settings to "Send anonymous data" to Adobe with the audio file and the corrected visemes. (It should already know how the software calculated them.)

..but I can see that taking a bunch of programming resources.

How about a request to everyone (like hundreds or thousands of users) to send in their files (like you have requested here)? Feed those thousands of files to Sensei and see what happens.

(...I am assuming at some point Sensei will become self-aware and use its new viseme powers to more closely emulate a human and end us all, but in the meantime it would save me some time fixing lip sync. Thanks!)

Report · Feb 07, 2023

Is it possible to provide us with your finished edits and original audio for us to compare what's automatic versus your edited (desired) output? This will help us train the software.

