KeyframeFace: Language-Driven Facial Animation via Semantic Keyframes

Jingchao Wu; Zejian Kang; Haibo Liu; Yuanchen Fei; Xiangru Huang

arXiv:2512.11321·cs.CV·May 13, 2026

TL;DR

KeyframeFace introduces a novel language-driven facial animation framework using semantic keyframes, enhancing control, interpretability, and editing precision over traditional dense motion regression methods.

Contribution

The paper proposes a semantic keyframe-based approach for facial animation driven by language, leveraging large language models and a new multimodal dataset for improved fidelity and interpretability.

Findings

01

Semantic keyframe supervision improves expression fidelity.

02

Language priors enhance semantic alignment of facial animations.

03

The authors construct a multimodal dataset with 2,100 scripts and annotated keyframes.

Abstract

Facial animation is a core component for creating digital characters in the computer graphics (CG) industry. A typical production workflow relies on sparse, semantically meaningful keyframes to precisely control facial expressions. Enabling such animation directly from natural-language descriptions could significantly improve content creation efficiency and accessibility. However, most existing methods adopt a text-to-continuous-frames paradigm, directly regressing dense facial motion trajectories from language. This formulation entangles high-level semantic intent with low-level motion, lacks an explicit semantic control structure, and limits precise editing and interpretability. Inspired by the keyframe paradigm in animation production, we propose KeyframeFace, a framework for semantic facial animation from language via interpretable keyframes. Instead of predicting dense motion…
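To make the contrast between sparse keyframes and dense motion trajectories concrete, here is a minimal sketch of how a handful of keyframes could be expanded into per-frame values. It is not the paper's method: the keyframe format, the control names (`smile`, `brow_raise`), the function names, and the choice of linear interpolation are all assumptions made for illustration.

```python
from bisect import bisect_right

# Hypothetical keyframe format (an assumption, not taken from the paper):
# (time in seconds, {control name: weight}).
Keyframe = tuple[float, dict[str, float]]


def sample(keyframes: list[Keyframe], t: float) -> dict[str, float]:
    """Linearly interpolate control weights at time t from sparse keyframes.

    Keyframes must be sorted by time. Controls missing from a keyframe
    are treated as 0.0.
    """
    times = [time for time, _ in keyframes]
    if t <= times[0]:
        return dict(keyframes[0][1])
    if t >= times[-1]:
        return dict(keyframes[-1][1])
    i = bisect_right(times, t)  # index of the first keyframe strictly after t
    (t0, a), (t1, b) = keyframes[i - 1], keyframes[i]
    u = (t - t0) / (t1 - t0)  # position between the two keyframes, in [0, 1)
    names = set(a) | set(b)
    return {n: (1 - u) * a.get(n, 0.0) + u * b.get(n, 0.0) for n in names}


def to_dense(keyframes: list[Keyframe], fps: int) -> list[dict[str, float]]:
    """Expand sparse keyframes into one set of control weights per frame."""
    start = keyframes[0][0]
    duration = keyframes[-1][0] - start
    n_frames = int(round(duration * fps)) + 1
    return [sample(keyframes, start + i / fps) for i in range(n_frames)]


# Three editable keyframes stand in for a two-second clip.
keyframes = [
    (0.0, {"smile": 0.0}),
    (1.0, {"smile": 1.0, "brow_raise": 0.5}),
    (2.0, {"smile": 0.2}),
]
frames = to_dense(keyframes, fps=4)
```

Editing a single keyframe (for example, lowering `smile` at 1.0 s) changes every dense frame around it, which is the kind of localized, interpretable control the keyframe paradigm is meant to provide.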
