Exploring Timeline Control for Facial Motion Generation
Yifeng Ma, Jinwei Qi, Chaonan Ji, Peng Zhang, Bang Zhang, Zhidong Deng, Liefeng Bo

TL;DR
This paper presents a novel timeline control method for facial motion generation, enabling precise, fine-grained control over facial actions and their timing, with applications in text-guided animation.
Contribution
It introduces a timeline control signal for facial motion generation, along with a diffusion-based model and a framework for annotating facial actions at frame-level granularity.
Findings
Accurately annotates facial action intervals with minimal human effort
Generates natural facial motions aligned with specified timelines
Supports text-guided facial motion generation using ChatGPT
Abstract
This paper introduces a new control signal for facial motion generation: timeline control. Compared to audio and text signals, timelines provide more fine-grained control, such as generating specific facial motions with precise timing. Users can specify a multi-track timeline of facial actions arranged in temporal intervals, allowing precise control over the timing of each action. To model the timeline control capability, we first annotate the time intervals of facial actions in natural facial motion sequences at frame-level granularity. This process is facilitated by Toeplitz Inverse Covariance-based Clustering to minimize human labor. Based on the annotations, we propose a diffusion-based generation model capable of generating facial motions that are natural and accurately aligned with input timelines. Our method supports text-guided motion generation by using ChatGPT to convert…
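To make the idea of a multi-track timeline concrete, the following is a minimal sketch of how such a timeline could be represented and expanded into frame-level labels. It is not the authors' implementation: the `ActionInterval` class, the `timeline_to_frame_labels` function, the action names, and the half-open frame convention are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ActionInterval:
    """One entry on a timeline track (hypothetical structure)."""
    action: str  # illustrative label such as "smile"; not taken from the paper
    start: int   # first frame of the action, inclusive
    end: int     # frame at which the action stops, exclusive


def timeline_to_frame_labels(intervals, actions, num_frames):
    """Expand intervals into a num_frames x len(actions) multi-hot matrix.

    Each action gets its own column (track), so overlapping actions on
    different tracks can be active in the same frame.
    """
    index = {name: i for i, name in enumerate(actions)}
    labels = [[0] * len(actions) for _ in range(num_frames)]
    for iv in intervals:
        if not (0 <= iv.start < iv.end <= num_frames):
            raise ValueError(f"interval out of range: {iv}")
        col = index[iv.action]
        for t in range(iv.start, iv.end):
            labels[t][col] = 1
    return labels


# Assumed action vocabulary and an example three-track timeline.
ACTIONS = ["blink", "smile", "head_nod"]
timeline = [
    ActionInterval("smile", 10, 40),
    ActionInterval("blink", 20, 24),
    ActionInterval("head_nod", 30, 50),
]
labels = timeline_to_frame_labels(timeline, ACTIONS, num_frames=60)
```

A per-frame multi-hot matrix of this kind is one plausible way to hand a timeline to a generative model as a conditioning signal; the paper's actual encoding may differ.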
Taxonomy
Topics: Face recognition and analysis
