CanonSwap: High-Fidelity and Consistent Video Face Swapping via Canonical Space Modulation

Xiangyang Luo; Ye Zhu; Yunfei Liu; Lijian Lin; Cong Wan; Zijian Cai; Shao-Lun Huang; Yu Li

arXiv:2507.02691·cs.CV·July 4, 2025

TL;DR

CanonSwap introduces a novel video face swapping framework that decouples motion and appearance information in a canonical space, enabling high-fidelity, consistent, and realistic identity transfer while preserving dynamic facial attributes.

Contribution

The paper proposes CanonSwap, a new method that separates motion from appearance for improved identity transfer and dynamic attribute preservation in video face swapping.

Findings

01 Outperforms existing methods in visual quality and temporal consistency

02 Achieves superior identity preservation with minimal artifacts

03 Provides comprehensive metrics for evaluating face swapping performance

Abstract

Video face swapping aims to address two primary challenges: effectively transferring the source identity to the target video and accurately preserving the dynamic attributes of the target face, such as head poses, facial expressions, lip-sync, etc. Existing methods mainly focus on achieving high-quality identity transfer but often fall short in maintaining the dynamic attributes of the target face, leading to inconsistent results. We attribute this issue to the inherent coupling of facial appearance and motion in videos. To address this, we propose CanonSwap, a novel video face-swapping framework that decouples motion information from appearance information. Specifically, CanonSwap first eliminates motion-related information, enabling identity modification within a unified canonical space. Subsequently, the swapped feature is reintegrated into the original video space, ensuring the…
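The three stages the abstract names (remove motion, modify identity in a canonical space, restore motion) can be illustrated with a toy sketch. This is not the paper's architecture: it assumes, purely for illustration, that a "face" is a small set of 2D points, that per-frame motion is a known rigid transform (rotation plus translation), and that an identity swap is a linear blend toward a source shape. Every function name here (`to_canonical`, `swap_in_canonical`, `from_canonical`, `swap_video`) is hypothetical.

```python
import numpy as np

def rotation(theta):
    """2x2 rotation matrix for angle theta (radians)."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s], [s, c]])

def from_canonical(points, R, t):
    """Apply per-frame motion to canonical points (rows): x = R p + t."""
    return points @ R.T + t

def to_canonical(points, R, t):
    """Remove per-frame motion: p = R^T (x - t), valid because R is orthogonal."""
    return (points - t) @ R

def swap_in_canonical(canonical, source_identity, strength=1.0):
    """Toy identity edit: blend the canonical shape toward the source shape."""
    return (1.0 - strength) * canonical + strength * source_identity

def swap_video(frames, motions, source_identity, strength=1.0):
    """Per frame: strip motion, edit identity, then re-apply the same motion."""
    out = []
    for frame, (R, t) in zip(frames, motions):
        canonical = to_canonical(frame, R, t)
        swapped = swap_in_canonical(canonical, source_identity, strength)
        out.append(from_canonical(swapped, R, t))
    return out

# Assumed toy data: a target shape moving over three frames, and a source shape.
target_shape = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
source_shape = np.array([[0.0, 0.0], [2.0, 0.0], [0.0, 2.0]])
motions = [(rotation(0.1 * k), np.array([float(k), -float(k)])) for k in range(3)]
frames = [from_canonical(target_shape, R, t) for R, t in motions]

swapped_frames = swap_video(frames, motions, source_shape)
```

Because the edit happens after motion is removed, every output frame carries the same edited shape while keeping its own frame's rotation and translation. That is the intuition behind decoupling. A real system has to estimate the motion and operate on learned features rather than on known rigid transforms.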
