EmoDiffTalk: Emotion-aware Diffusion for Editable 3D Gaussian Talking Head
Chang Liu, Tianjiao Jing, Chengcheng Ma, Xuanqi Zhou, Zhengxuan Lian, Qin Jin, Hongliang Yuan, Shi-Sheng Huang

TL;DR
EmoDiffTalk introduces an emotion-aware diffusion framework for editable 3D Gaussian talking heads, enabling fine-grained, multimodal emotional editing with high fidelity and controllability, advancing the realism and expressiveness of 3D talking head synthesis.
Contribution
It presents a novel emotion-aware Gaussian diffusion method with an AU-based emotion controller for high-quality, editable 3D talking heads with continuous emotional expression editing.
Findings
Superior emotional subtlety demonstrated on the public EmoTalk3D and RenderMe-360 datasets
High lip-sync fidelity achieved
Enhanced controllability over emotional expressions
Abstract
Recent photo-realistic 3D talking heads built on 3D Gaussian Splatting still have significant shortcomings in emotional expression manipulation, especially for fine-grained and expansive dynamic emotional editing using multi-modal control. This paper introduces a new editable 3D Gaussian talking head, EmoDiffTalk. Our key idea is a novel Emotion-aware Gaussian Diffusion, which includes an action unit (AU) prompt Gaussian diffusion process for a fine-grained facial animator, and a text-to-AU emotion controller that provides accurate and expansive dynamic emotional editing from text input. Experiments on the public EmoTalk3D and RenderMe-360 datasets demonstrate the superior emotional subtlety, lip-sync fidelity, and controllability of EmoDiffTalk over previous works, establishing a principled pathway toward high-quality, diffusion-driven, multimodal editable 3D talking-head…
Taxonomy
Topics: Face recognition and analysis · Generative Adversarial Networks and Image Synthesis · Emotion and Mood Recognition
