
Char Anim's "Compute Lipsync Take From Audio and Transcript" can't detect w's

Community Beginner, Aug 29, 2023


I've been using Character Animator's "Compute Lipsync Take From Audio and Transcript" for quite a while now, and I've noticed that it almost always fails to detect the "w" in words that start with it (e.g., what, where). I don't know if it's just me, but whenever I use that feature, I have to go back to the start of the timeline and fix the "w's" by hand. My Lip Sync Viseme Detection is already cranked up to the max in my Preferences settings. I don't know if there's a fix for this or if this is just how it works right now (if it's the latter, hopefully it can get fixed).

Bug Unresolved
TOPICS
General , Performance


1 Correct answer
1 Comment
Adobe Employee, Aug 30, 2023


I got curious and recorded a bit of audio saying "Who, what, when, where, and why?" and indeed it only used a W viseme on the first one and that's actually not for the W, it's for the "Oo" at the end of the word. I tried a bunch of other W words and noticed the same thing.

So here's a fun peek under the hood. Inside the app resources, the (plain text) file `lex-phonealign/model/dictionary` has the full list of words that the transcript-based aligner recognizes directly, along with their phonetic decompositions (multiple for words like tomato, for toe-may-toe and toe-mah-toe). There's another bit of code that handles words not in the dictionary, so that things like names and nonsense words (I used Jabberwocky to test it) can still work. You'll note that there are more phonetic codes than there are visemes, so there's a lookup table in the Lua code that describes how those codes get converted to visemes.
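To make the "multiple decompositions per word" idea concrete, here is a minimal Python sketch of reading such a file. The line layout (`WORD  PH1 PH2 ...`, with a `(2)` suffix for alternate pronunciations and `;;;` comment lines) is an assumption borrowed from CMUdict-style dictionaries, and the sample entries are illustrative; the actual contents of the app's dictionary file may differ.

```python
from collections import defaultdict

# Illustrative entries only. The line format and the "(2)" suffix are
# assumptions based on CMUdict-style files, not confirmed for the app.
SAMPLE = """\
TOMATO  T AH0 M EY1 T OW2
TOMATO(2)  T AH0 M AA1 T OW2
WHO  HH UW1
"""

def parse_dictionary(text):
    """Map each lowercase word to a list of phoneme sequences."""
    entries = defaultdict(list)
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith(";;;"):  # skip blanks and comments
            continue
        head, *phones = line.split()
        word = head.split("(")[0].lower()  # "TOMATO(2)" -> "tomato"
        entries[word].append(phones)
    return dict(entries)

pronunciations = parse_dictionary(SAMPLE)
for phones in pronunciations["tomato"]:
    print(" ".join(phones))
```

A word with two pronunciations simply ends up with two phoneme lists under the same key.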

Of all those phonetic codes, only 3 map to the W-Oo viseme and all the uses I see are for the Oo and not the W.

UW0 = "W-Oo",
UW1 = "W-Oo",
UW2 = "W-Oo",
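As a rough illustration of what that means for the test sentence, the Python sketch below checks which words contain any of the three codes above. Only the `UW0`/`UW1`/`UW2` entries come from the lookup table; the per-word phoneme lists are assumed ARPAbet-style decompositions for illustration, not copied from the app's dictionary.

```python
# The only three codes that map to the W-Oo viseme (from the table above).
W_OO_CODES = {"UW0", "UW1", "UW2"}

# Assumed decompositions, for illustration only.
WORDS = {
    "who": ["HH", "UW1"],
    "what": ["W", "AH1", "T"],
    "when": ["W", "EH1", "N"],
    "where": ["W", "EH1", "R"],
    "why": ["W", "AY1"],
}

def uses_w_oo(phones):
    """True if any phoneme in the sequence maps to the W-Oo viseme."""
    return any(p in W_OO_CODES for p in phones)

words_with_w_oo = [word for word, phones in WORDS.items() if uses_w_oo(phones)]
print(words_with_w_oo)  # ['who']
```

Under these assumptions only "who" triggers W-Oo, and it does so through its vowel rather than its initial consonant, which matches what I saw in the recording.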


Also, you're correct that the W is treated as silence (the only letter that is, actually). I'd have to ask internally about how that lookup table from phoneme to viseme was decided. I suspect it is because the Oo viseme tends to be really dramatic, and for a lot of words it'd look weird to have a strong "Oo" for a brief W that immediately turns into some other sound. I haven't tried this, but you may be able to experiment and get what you want by tweaking the dictionary. Unfortunately, I don't currently have an easy way to tweak the phoneme-to-viseme map.
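If you want to try the dictionary experiment, one possible approach is sketched below in Python: swap a leading `W` phoneme for one of the codes that maps to W-Oo. This is untested against the app. The helper name `add_w_viseme`, the choice of `UW0`, and the `WORD  PH1 PH2 ...` line layout are all assumptions, and it's unknown whether the aligner will still match the audio well after such a change, so work on a backup copy of the file.

```python
def add_w_viseme(line, replacement="UW0"):
    """Return the dictionary line with a leading W phoneme replaced.

    Hypothetical helper: assumes each line is a word followed by
    whitespace-separated phoneme codes. Other lines are returned unchanged.
    """
    parts = line.split()
    if len(parts) < 2 or parts[1] != "W":
        return line
    word, phones = parts[0], parts[1:]
    phones[0] = replacement
    return word + "  " + " ".join(phones)

print(add_w_viseme("WHAT  W AH1 T"))  # WHAT  UW0 AH1 T
print(add_w_viseme("WHO  HH UW1"))    # WHO  HH UW1
```

Only words that begin with `W` are touched here, which keeps the change narrow enough to undo if the resulting mouth shapes look too exaggerated.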

 

Hopefully that helps.


Dan Tull
Adobe Character Animator Team
