FaceEditTalker: Controllable Talking Head Generation with Facial Attribute Editing
Guanwen Feng, Zhiyuan Ma, Yunan Li, Jiahao Yang, Junwei Jing, Qiguang Miao

TL;DR
FaceEditTalker introduces a unified framework for controllable facial attribute editing in audio-driven talking head videos, enabling personalized and adaptable digital avatars with high synchronization and visual quality.
Contribution
The paper presents a novel method combining semantic feature editing with audio-driven video synthesis for flexible facial attribute control in talking head generation.
Findings
Achieves high lip-sync accuracy and visual fidelity.
Enables flexible editing of facial attributes like hairstyle and accessories.
Demonstrates superior performance over baseline methods.
Abstract
Recent advances in audio-driven talking head generation have achieved impressive results in lip synchronization and emotional expression. However, they largely overlook the crucial task of facial attribute editing. This capability is indispensable for achieving deep personalization and expanding the range of practical applications, including user-tailored digital avatars, engaging online education content, and brand-specific digital customer service. In these key domains, flexible adjustment of visual attributes, such as hairstyle, accessories, and subtle facial features, is essential for aligning with user preferences, reflecting diverse brand identities, and adapting to varying contextual demands. In this paper, we present FaceEditTalker, a unified framework that enables controllable facial attribute manipulation while generating high-quality, audio-synchronized talking head videos.…
