KeyframeFace: Language-Driven Facial Animation via Semantic Keyframes
Jingchao Wu, Zejian Kang, Haibo Liu, Yuanchen Fei, Xiangru Huang

TL;DR
KeyframeFace introduces a novel language-driven facial animation framework using semantic keyframes, enhancing control, interpretability, and editing precision over traditional dense motion regression methods.
Contribution
The paper proposes a semantic keyframe-based approach for facial animation driven by language, leveraging large language models and a new multimodal dataset for improved fidelity and interpretability.
Findings
Semantic keyframe supervision improves expression fidelity.
Language priors enhance semantic alignment of facial animations.
The authors constructed a multimodal dataset with 2,100 scripts and annotated keyframes.
Abstract
Facial animation is a core component of creating digital characters in the computer graphics (CG) industry. A typical production workflow relies on sparse, semantically meaningful keyframes to precisely control facial expressions. Enabling such animation directly from natural-language descriptions could significantly improve content creation efficiency and accessibility. However, most existing methods adopt a text-to-continuous-frames paradigm, directly regressing dense facial motion trajectories from language. This formulation entangles high-level semantic intent with low-level motion, lacks an explicit semantic control structure, and limits precise editing and interpretability. Inspired by the keyframe paradigm in animation production, we propose KeyframeFace, a framework for semantic facial animation from language via interpretable keyframes. Instead of predicting dense motion…
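To make the contrast between sparse keyframes and dense motion trajectories concrete, here is a minimal sketch of how a few keyframes can be expanded into a per-frame sequence. It is not the paper's method: the `Keyframe` class, the parameter names (`smile`, `brow_raise`), and the choice of plain linear interpolation are all assumptions made for illustration.

```python
from bisect import bisect_right
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Keyframe:
    """Hypothetical sparse keyframe: a timestamp plus named expression weights."""
    time: float                 # seconds
    weights: Dict[str, float]   # illustrative parameter names, values in [0, 1]


def interpolate(keyframes: List[Keyframe], t: float) -> Dict[str, float]:
    """Linearly interpolate expression weights at time t (clamped at both ends)."""
    frames = sorted(keyframes, key=lambda k: k.time)
    if t <= frames[0].time:
        return dict(frames[0].weights)
    if t >= frames[-1].time:
        return dict(frames[-1].weights)
    times = [k.time for k in frames]
    i = bisect_right(times, t)          # times[i-1] <= t < times[i]
    a, b = frames[i - 1], frames[i]
    alpha = (t - a.time) / (b.time - a.time)
    names = set(a.weights) | set(b.weights)
    # A parameter missing from a keyframe is treated as 0.0 (an assumption).
    return {
        n: (1 - alpha) * a.weights.get(n, 0.0) + alpha * b.weights.get(n, 0.0)
        for n in names
    }


def to_dense(keyframes: List[Keyframe], fps: float) -> List[Dict[str, float]]:
    """Expand sparse keyframes into a dense per-frame sequence sampled at fps."""
    frames = sorted(keyframes, key=lambda k: k.time)
    start, end = frames[0].time, frames[-1].time
    n = int(round((end - start) * fps)) + 1
    return [interpolate(frames, start + i / fps) for i in range(n)]


keys = [
    Keyframe(0.0, {"smile": 0.0}),
    Keyframe(1.0, {"smile": 1.0, "brow_raise": 0.5}),
]
dense = to_dense(keys, fps=4)
```

In this sketch, editing one expression means changing a single keyframe entry and re-running `to_dense`, whereas a dense representation would require touching every affected frame.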
