Progress in animation of an EMA-controlled tongue model for acoustic-visual speech synthesis
Ingmar Steiner (INRIA Lorraine - LORIA), Slim Ouni (INRIA Lorraine - LORIA)

TL;DR
This paper introduces a new method for animating a 3D tongue model in speech synthesis by combining EMA motion capture data with MRI-derived surface extraction, enhancing visual speech realism.
Contribution
It presents a novel skeletal animation technique for a deformable tongue model using EMA data and MRI surface extraction, advancing visual speech synthesis.
Findings
Initial animation results are shown for the EMA-controlled tongue model.
The tongue surface is extracted from volumetric MRI data, while EMA motion capture data drives the deformable rig.
Future work is outlined.
Abstract
We present a technique for the animation of a 3D kinematic tongue model, one component of the talking head of an acoustic-visual (AV) speech synthesizer. The skeletal animation approach is adapted to make use of a deformable rig controlled by tongue motion capture data obtained with electromagnetic articulography (EMA), while the tongue surface is extracted from volumetric magnetic resonance imaging (MRI) data. Initial results are shown and future work outlined.
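The idea of a surface mesh driven by a small set of tracked points can be illustrated with a minimal sketch. The code below is not the authors' implementation: it assumes a translation-only form of linear blend skinning, inverse-distance skinning weights, and toy coordinates. The names `skinning_weights` and `deform` are hypothetical, and a full skeletal rig would also apply bone rotations.

```python
import numpy as np


def skinning_weights(vertices, rest_coils, eps=1e-6):
    """Inverse-squared-distance weights (an assumed choice); each row sums to 1."""
    # Pairwise distances between mesh vertices and coil rest positions,
    # shape (n_vertices, n_coils).
    dist = np.linalg.norm(vertices[:, None, :] - rest_coils[None, :, :], axis=2)
    raw = 1.0 / (dist + eps) ** 2
    return raw / raw.sum(axis=1, keepdims=True)


def deform(vertices, rest_coils, frame_coils, weights):
    """Move each vertex by the weighted mean of the coil displacements."""
    displacement = frame_coils - rest_coils  # shape (n_coils, 3)
    return vertices + weights @ displacement


# Toy data: three surface vertices and two coils (all values are made up).
vertices = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [2.0, 0.0, 0.0]])
rest_coils = np.array([[0.0, 0.0, 1.0], [2.0, 0.0, 1.0]])

# One animation frame: only the first coil is raised along z.
frame_coils = rest_coils + np.array([[0.0, 0.0, 0.5], [0.0, 0.0, 0.0]])

weights = skinning_weights(vertices, rest_coils)
deformed = deform(vertices, rest_coils, frame_coils, weights)
```

In this sketch the vertex nearest the moved coil follows it most closely, and vertices farther away move less, which is the qualitative behaviour expected of a rig controlled by sparse motion capture points.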
Taxonomy
Topics: Cleft Lip and Palate Research · Voice and Speech Disorders · Speech and Audio Processing
