The impact of differences in facial features between real speakers and 3D face models on synthesized lip motions
Rabab Algadhy, Yoshihiko Gotoh, Steve Maddock

TL;DR
This study examines how differences in facial features between real speakers and 3D face models affect the accuracy of synthesized lip motions, emphasizing the importance of feature matching for realistic animation.
Contribution
The paper investigates how mismatches in facial features between real speakers and 3D face models affect lip motion synthesis, highlighting the need for careful face feature alignment in 3D facial animation workflows.
Findings
Mismatch in mouth height between real and synthetic faces reduces lip motion quality
Proper face feature matching improves lip motion realism
Facial feature differences significantly influence lip motion accuracy
Abstract
Lip motion accuracy is important for speech intelligibility, especially for users who are hard of hearing or second language learners. A high level of realism in lip movements is also required for the game and film production industries. 3D morphable models (3DMMs) have been widely used for facial analysis and animation. However, factors that could influence their use in facial animation, such as the differences in facial features between recorded real faces and animated synthetic faces, have not been given adequate attention. This paper investigates the mapping between real speakers and similar and non-similar 3DMMs and the impact on the resulting 3D lip motion. Mouth height and mouth width are used to determine face similarity. The results show that mapping 2D videos of real speakers with low mouth heights to 3D heads that correspond to real speakers with high mouth heights, or vice…
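The abstract states that mouth height and mouth width are used to determine face similarity, but it does not say how they are measured or compared. The following is a minimal sketch of one possible approach, not the authors' method. It assumes that each face is described by four 2D landmarks under hypothetical key names (`upper_lip`, `lower_lip`, `left_corner`, `right_corner`), that both faces are already on a common scale, and that a relative tolerance (the `tolerance` parameter, an arbitrary illustrative value) decides similarity.

```python
import math


def distance(p, q):
    """Euclidean distance between two 2D points given as (x, y) tuples."""
    return math.hypot(p[0] - q[0], p[1] - q[1])


def mouth_features(landmarks):
    """Return (height, width) of the mouth.

    `landmarks` is a dict with hypothetical keys chosen for this sketch:
    'upper_lip', 'lower_lip', 'left_corner', 'right_corner'.
    """
    height = distance(landmarks["upper_lip"], landmarks["lower_lip"])
    width = distance(landmarks["left_corner"], landmarks["right_corner"])
    return height, width


def relative_gap(a, b):
    """Relative difference between two non-negative measurements, in [0, 1]."""
    largest = max(a, b)
    if largest == 0:
        return 0.0
    return abs(a - b) / largest


def is_similar(speaker, model, tolerance=0.15):
    """Assumed similarity rule: both mouth height and mouth width must
    agree within `tolerance` (an illustrative threshold, not from the paper)."""
    speaker_h, speaker_w = mouth_features(speaker)
    model_h, model_w = mouth_features(model)
    return (relative_gap(speaker_h, model_h) <= tolerance
            and relative_gap(speaker_w, model_w) <= tolerance)


# Illustrative landmark sets (made-up coordinates, arbitrary units).
speaker = {"upper_lip": (0, 0), "lower_lip": (0, 10),
           "left_corner": (-25, 5), "right_corner": (25, 5)}
close_model = {"upper_lip": (0, 0), "lower_lip": (0, 11),
               "left_corner": (-26, 5), "right_corner": (26, 5)}
tall_model = {"upper_lip": (0, 0), "lower_lip": (0, 20),
              "left_corner": (-25, 5), "right_corner": (25, 5)}

print(is_similar(speaker, close_model))
print(is_similar(speaker, tall_model))
```

Under this assumed rule, a speaker would be paired with a "similar" 3D head only when both measurements fall within the tolerance. A head whose mouth height differs substantially would count as "non-similar", which is the kind of mismatch the paper reports as affecting the resulting lip motion.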
