SynergyWarpNet: Attention-Guided Cooperative Warping for Neural Portrait Animation
Shihang Li, Zhiqiang Gong, Minming Ye, Yue Gao, Wen Yao

TL;DR
SynergyWarpNet is a novel attention-guided framework for neural portrait animation that progressively refines talking head synthesis by combining explicit warping, reference-based correction, and confidence-guided fusion, achieving state-of-the-art results.
Contribution
It introduces a three-stage cooperative warping approach integrating 3D optical flow, cross-attention with multiple references, and confidence-guided fusion for high-fidelity portrait animation.
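As a rough illustration of how the last two stages could fit together, the sketch below runs cross-attention from warped features to reference features and then applies a per-location confidence blend. It is not the authors' implementation: the function names (`cross_attention`, `confidence_fusion`), the scaled dot-product form of attention, and the convex blend `c * warped + (1 - c) * corrected` are all assumptions made for illustration, and the first stage (3D optical-flow warping) is taken as already applied to produce `warped`.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys, values):
    """Scaled dot-product cross-attention (assumed form, not from the paper).

    Each query vector (a warped-feature location) attends over all
    reference feature vectors and returns a weighted sum of `values`.
    """
    dim = len(keys[0])
    attended = []
    for q in queries:
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(dim)
            for k in keys
        ]
        weights = softmax(scores)
        attended.append([
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ])
    return attended


def confidence_fusion(warped, corrected, confidence):
    """Hypothetical fusion rule: per-location convex blend.

    confidence[i] in [0, 1]; 1 keeps the warped feature unchanged,
    0 replaces it entirely with the reference-corrected feature.
    """
    return [
        [c * w + (1.0 - c) * r for w, r in zip(w_vec, r_vec)]
        for w_vec, r_vec, c in zip(warped, corrected, confidence)
    ]


# Toy example: two feature locations, three reference feature vectors.
warped = [[1.0, 0.0], [0.0, 1.0]]
references = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
corrected = cross_attention(warped, references, references)
fused = confidence_fusion(warped, corrected, [1.0, 0.0])
```

In the toy example, the first location is fully trusted and keeps its warped feature, while the second has zero confidence and takes the attention-corrected feature. In a real model the confidence values would be predicted by a network rather than fixed by hand.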
Findings
Achieves state-of-the-art performance on benchmark datasets.
Effectively handles occlusions and distortions in portrait animation.
Outperforms existing methods in visual quality and motion accuracy.
Abstract
Recent advances in neural portrait animation have demonstrated remarkable potential for applications in virtual avatars, telepresence, and digital content creation. However, traditional explicit warping approaches often struggle with accurate motion transfer or recovering missing regions, while recent attention-based warping methods, though effective, frequently suffer from high complexity and weak geometric grounding. To address these issues, we propose SynergyWarpNet, an attention-guided cooperative warping framework designed for high-fidelity talking head synthesis. Given a source portrait, a driving image, and a set of reference images, our model progressively refines the animation in three stages. First, an explicit warping module performs coarse spatial alignment between the source and driving image using 3D dense optical flow. Next, a reference-augmented correction module leverages…
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Face recognition and analysis · Multimodal Machine Learning Applications
