TL;DR
EmoFace is a novel audio-driven method that generates emotionally expressive 3D facial animations with synchronized lip movements, natural eye behaviors, and applicability to MetaHuman models, supported by a new emotional audio-visual dataset.
Contribution
We introduce EmoFace, a new approach for emotional 3D face animation that incorporates emotion-aware encoding, post-processing for realism, and a dedicated dataset for MetaHuman applications.
Findings
Outperforms existing methods in emotional facial animation quality
Successfully generates natural blinks and eye movements
Demonstrates effectiveness in virtual reality and game NPCs
Abstract
Audio-driven emotional 3D face animation aims to generate emotionally expressive talking heads with synchronized lip movements. However, previous research has often overlooked the influence of diverse emotions on facial expressions or proved unsuitable for driving MetaHuman models. To address this gap, we introduce EmoFace, a novel audio-driven methodology for creating facial animations with vivid emotional dynamics. Our approach can generate facial expressions for multiple emotions and can produce random yet natural blinks and eye movements while maintaining accurate lip synchronization. We propose independent speech and emotion encoders that learn the relationship between audio, emotion, and the corresponding facial controller rigs, and then map their outputs to a sequence of controller values. Additionally, we introduce two post-processing techniques…
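The two-encoder idea in the abstract can be illustrated with a toy sketch. This is not the authors' architecture: the class name `TwoBranchSketch`, the layer sizes, the one-hot emotion input, the random untrained weights, and the [0, 1] controller range are all assumptions made for illustration. It only shows the data flow: per-frame audio features and an emotion label are encoded separately, concatenated, and decoded into one vector of rig controller values per frame.

```python
import random


def make_linear(n_in, n_out, rng):
    """Random weight matrix with n_out rows and n_in columns (untrained)."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]


def apply_linear(weights, vec):
    """Matrix-vector product: one dot product per output row."""
    return [sum(w * x for w, x in zip(row, vec)) for row in weights]


def one_hot(index, size):
    return [1.0 if i == index else 0.0 for i in range(size)]


def clamp01(x):
    return min(1.0, max(0.0, x))


class TwoBranchSketch:
    """Hypothetical stand-in: separate speech and emotion branches, one decoder."""

    def __init__(self, audio_dim, n_emotions, hidden_dim, n_controllers, seed=0):
        rng = random.Random(seed)
        self.n_emotions = n_emotions
        self.speech_enc = make_linear(audio_dim, hidden_dim, rng)
        self.emotion_enc = make_linear(n_emotions, hidden_dim, rng)
        self.decoder = make_linear(2 * hidden_dim, n_controllers, rng)

    def forward(self, audio_frames, emotion_id):
        # The emotion is encoded once and shared by every frame of the clip.
        emo = apply_linear(self.emotion_enc, one_hot(emotion_id, self.n_emotions))
        sequence = []
        for frame in audio_frames:
            speech = apply_linear(self.speech_enc, frame)
            fused = speech + emo  # list concatenation of the two embeddings
            raw = apply_linear(self.decoder, fused)
            # Assumed convention: controller values live in [0, 1].
            sequence.append([clamp01(0.5 + v) for v in raw])
        return sequence


model = TwoBranchSketch(audio_dim=4, n_emotions=3, hidden_dim=8, n_controllers=5)
frames = [[0.1, 0.2, 0.3, 0.4], [0.0, -0.1, 0.2, 0.5]]
controllers = model.forward(frames, emotion_id=1)
other_emotion = model.forward(frames, emotion_id=2)
```

Keeping the emotion branch separate means the same audio frames yield different controller sequences when only `emotion_id` changes, which is the property the abstract attributes to independent encoders. A real system would replace the random linear maps with trained networks and pretrained audio features.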
