Participating Frequently
August 29, 2023

Char Anim's "Compute Lipsync Take From Audio and Transcript" can't detect w's

I've been using Character Animator's "Compute Lipsync Take From Audio and Transcript" for quite a while now, and I've noticed that it almost always fails to detect the "w" in words that start with it (e.g. what, where). I don't know if it's just me, but whenever I use that feature, I have to go back to the start of the timeline and fix the "w's". My Lip Sync Viseme Detection is already cranked up to the max in my Preferences settings. I don't know if there's a fix for this or if this is just how it is right now (if it's the latter, hopefully it can get fixed).

1 reply

Community Manager
August 31, 2023

I got curious and recorded a bit of audio saying "Who, what, when, where, and why?" and indeed it only used the W-Oo viseme on the first word, and even that isn't for the W; it's for the "Oo" at the end of the word. I tried a bunch of other W words and saw the same thing.

So here's a fun peek under the hood. Inside the app resources, the (plain text) file `lex-phonealign/model/dictionary` has the full list of words that the transcript-based aligner recognizes directly, along with their phonetic decompositions (more than one for words like tomato: toe-may-toe and toe-mah-toe). There's another bit of code that handles words not in the dictionary, so that things like names and nonsense words (I used Jabberwocky to test it) can still work. You'll note that there are more phonetic codes than there are visemes, so there's a lookup table in the Lua code that describes how those codes get converted to visemes.

Of all those phonetic codes, only 3 map to the W-Oo viseme and all the uses I see are for the Oo and not the W.

UW0 = "W-Oo",
UW1 = "W-Oo",
UW2 = "W-Oo",
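As a rough illustration of how a lookup like this behaves, here is a minimal Python sketch (the real table lives in the app's Lua code and is not reproduced here). Only the three `UW*` entries and the "W is silence" behavior come from this thread; the names `PHONEME_TO_VISEME` and `visemes_for`, the `"?"` placeholder, and the sample pronunciations are assumptions made for the example.

```python
# Hypothetical sketch of a phoneme-to-viseme lookup; NOT the app's actual Lua table.
# Only the UW* -> "W-Oo" entries and W -> silence are taken from this thread.
PHONEME_TO_VISEME = {
    "UW0": "W-Oo",
    "UW1": "W-Oo",
    "UW2": "W-Oo",
    "W": None,  # treated as silence, so no viseme is emitted
}


def visemes_for(phonemes):
    """Return the visemes for a list of phoneme codes.

    Codes this sketch does not know about are shown as "?".
    """
    result = []
    for code in phonemes:
        if code not in PHONEME_TO_VISEME:
            result.append("?")  # mapping unknown in this sketch
            continue
        viseme = PHONEME_TO_VISEME[code]
        if viseme is not None:  # skip phonemes mapped to silence
            result.append(viseme)
    return result


# Assumed ARPAbet-style pronunciations, for illustration only.
print(visemes_for(["HH", "UW1"]))      # "who"  -> ['?', 'W-Oo']
print(visemes_for(["W", "AH1", "T"]))  # "what" -> ['?', '?']
```

In this sketch "who" gets a W-Oo viseme only because of its vowel, while the W in "what" contributes nothing, which matches the behavior seen in the test recording.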


Also, you're correct that the W is treated as silence (the only letter that is, actually). I'd have to ask internally about how that lookup table from phoneme to viseme was decided. I suspect it's because the Oo viseme tends to be really dramatic, and for a lot of words it'd look weird to have a strong "Oo" for a brief W that immediately turns into some other sound. I haven't tried this, but you may be able to experiment and get what you want by tweaking the dictionary. Unfortunately, I don't currently have an easy way to tweak the phoneme-to-viseme map.
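For anyone who wants to try the dictionary experiment, here is a small, untested-in-the-app Python sketch of the idea: swap a leading `W` phoneme for one of the `UW*` codes that do map to W-Oo. It assumes each dictionary entry is one line of the form `WORD PH1 PH2 ...`, which is a guess about the file layout, and `swap_leading_w` is a made-up helper name. Whether the aligner accepts the edited entry, or times it sensibly, is unknown.

```python
def swap_leading_w(line, replacement="UW0"):
    """Replace a leading W phoneme in one dictionary entry with `replacement`.

    Assumed entry layout (not confirmed): "WORD PH1 PH2 ...".
    Whitespace is normalized to single spaces in the returned line.
    """
    parts = line.split()
    # parts[0] is the word itself; parts[1:] are its phoneme codes.
    if len(parts) > 1 and parts[1] == "W":
        parts[1] = replacement
    return " ".join(parts)


print(swap_leading_w("WHAT W AH1 T"))              # WHAT UW0 AH1 T
print(swap_leading_w("TOMATO T AH0 M EY1 T OW2"))  # TOMATO T AH0 M EY1 T OW2
```

Working on a copy of the dictionary file and changing a single word first makes it easy to see whether the change has any effect before applying it more widely.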

Hopefully that helps.


Dan Tull
Adobe Character Animator Team