MIMAFace: Face Animation via Motion-Identity Modulated Appearance Feature Learning
Yue Han, Junwei Zhu, Yuxiang Feng, Xiaozhong Ji, Keke He, Xiangtai Li, Zhucun Xue, Yong Liu

TL;DR
This paper introduces MIMAFace, a face animation method that improves temporal stability and identity preservation by modulating appearance features at the motion and identity levels, narrowing the quality gap that diffusion-based methods show when trained on public datasets.
Contribution
The paper proposes the MIA and ICA modules that improve appearance feature learning and temporal consistency in face animation, advancing beyond existing diffusion-based methods.
Findings
Achieves precise facial motion control and identity preservation.
Generates animation videos with high intra/inter-clip temporal consistency.
Demonstrates superior performance on various datasets.
Abstract
Current diffusion-based face animation methods generally adopt a ReferenceNet (a copy of U-Net) and a large amount of curated self-acquired data to learn appearance features, as robust appearance features are vital for ensuring temporal stability. However, when trained on public datasets, the results often exhibit a noticeable performance gap in image quality and temporal consistency. To address this issue, we meticulously examine the essential appearance features in facial animation tasks, which include motion-agnostic (e.g., clothing, background) and motion-related (e.g., facial details) texture components, along with high-level discriminative identity features. Drawing from this analysis, we introduce a Motion-Identity Modulated Appearance Learning Module (MIA) that modulates CLIP features at both motion and identity levels. Additionally, to tackle the semantic/color…
Taxonomy
Topics: Face recognition and analysis · Video Surveillance and Tracking Methods · Human Motion and Animation
Methods: Contrastive Language-Image Pre-training
