- Nov 2022
blog.duolingo.com
To create accurate animations, we generate the speech, run it through our in-house speech recognition and pronunciation models, and get the timing for each word and phoneme (speech sound). Each sound is mapped onto a visual representation, or viseme, in a set we designed based on linguistic features.
Viseme: the smallest visual unit of speech, i.e. the mouth shape that represents a particular phoneme (speech sound).
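
The mapping step described in the quoted passage can be sketched roughly as follows. This is only an illustration, not Duolingo's implementation: the phoneme-to-viseme table, the viseme names, and the `TimedPhoneme`, `VisemeKeyframe`, and `phonemes_to_visemes` names are all hypothetical, and the timings are assumed to come from an upstream alignment step of some kind.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical phoneme-to-viseme table, for illustration only.
# The actual viseme set described in the blog post is not given in the excerpt.
PHONEME_TO_VISEME = {
    "p": "closed_lips", "b": "closed_lips", "m": "closed_lips",
    "f": "lip_teeth", "v": "lip_teeth",
    "a": "open", "o": "rounded", "u": "rounded",
    "i": "spread", "e": "spread",
}
DEFAULT_VISEME = "neutral"  # fallback for phonemes missing from the table


@dataclass
class TimedPhoneme:
    phoneme: str
    start: float  # seconds
    end: float    # seconds


@dataclass
class VisemeKeyframe:
    viseme: str
    start: float
    end: float


def phonemes_to_visemes(timed: List[TimedPhoneme]) -> List[VisemeKeyframe]:
    """Map timed phonemes to viseme keyframes, merging adjacent repeats."""
    keyframes: List[VisemeKeyframe] = []
    for p in timed:
        viseme = PHONEME_TO_VISEME.get(p.phoneme, DEFAULT_VISEME)
        last = keyframes[-1] if keyframes else None
        # Extend the previous keyframe when the same viseme continues
        # without a gap, so the mouth shape is held rather than re-triggered.
        if last and last.viseme == viseme and p.start <= last.end + 1e-9:
            last.end = p.end
        else:
            keyframes.append(VisemeKeyframe(viseme, p.start, p.end))
    return keyframes


# Made-up timings for the phoneme sequence m-a-p-b.
example = [
    TimedPhoneme("m", 0.0, 0.1),
    TimedPhoneme("a", 0.1, 0.3),
    TimedPhoneme("p", 0.3, 0.4),
    TimedPhoneme("b", 0.4, 0.5),
]
frames = phonemes_to_visemes(example)
```

Because several phonemes can share one viseme in this sketch ("p" and "b" both map to `closed_lips`), the last two phonemes collapse into a single keyframe. Whether the real viseme set groups sounds this way is not stated in the excerpt.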