RIGID: Recurrent GAN Inversion and Editing of Real Face Videos
Yangyang Xu, Shengfeng He, Kwan-Yee K. Wong, Ping Luo

TL;DR
RIGID introduces a recurrent framework for temporally coherent GAN inversion and editing of real face videos, improving consistency and quality over existing methods without retraining for different edits.
Contribution
The paper proposes a novel end-to-end recurrent approach that models temporal relations for consistent video GAN inversion and editing, addressing issues of inconsistency and noise.
Findings
Outperforms state-of-the-art methods in inversion quality.
Achieves more temporally consistent facial editing across frames.
Supports arbitrary edits without retraining for each edit.
Abstract
GAN inversion is indispensable for applying the powerful editability of GANs to real images. However, existing methods invert video frames individually, often leading to undesired inconsistent results over time. In this paper, we propose a unified recurrent framework, named Recurrent vIdeo GAN Inversion and eDiting (RIGID), to explicitly and simultaneously enforce temporally coherent GAN inversion and facial editing of real videos. Our approach models the temporal relations between current and previous frames from three aspects. First, to enable a faithful real video reconstruction, we maximize the inversion fidelity and consistency by learning a temporal compensated latent code. Second, we observe that incoherent noise lies in the high-frequency domain and can be disentangled from the latent space. Third, to remove the inconsistency after attribute…
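The problem the abstract opens with, per-frame inversion producing latent codes that flicker over time, can be illustrated with a toy sketch. This is not the RIGID architecture: the paper learns its temporal compensation with a network, whereas the sketch below swaps in a fixed exponential blend with the previous frame's output purely as a stand-in for "conditioning on the previous frame". All names (`per_frame_inversion`, `recurrent_inversion`, `temporal_jitter`, `alpha`) are hypothetical and do not come from the paper.

```python
import random


def temporal_jitter(codes):
    """Mean absolute change between consecutive latent codes."""
    total, count = 0.0, 0
    for prev, cur in zip(codes, codes[1:]):
        total += sum(abs(c - p) for p, c in zip(prev, cur))
        count += len(cur)
    return total / count


def per_frame_inversion(true_codes, noise_std, rng):
    """Toy stand-in for an image encoder applied to each frame independently.

    Assumption: each frame's estimate is the true latent plus independent
    Gaussian error, which is what makes the sequence flicker.
    """
    return [[x + rng.gauss(0.0, noise_std) for x in code] for code in true_codes]


def recurrent_inversion(per_frame_codes, alpha=0.5):
    """Blend each frame's code with the previous output code.

    Assumption: a fixed blend weight `alpha` replaces the learned temporal
    compensation described in the abstract.
    """
    outputs = [list(per_frame_codes[0])]
    for code in per_frame_codes[1:]:
        prev = outputs[-1]
        outputs.append([alpha * c + (1.0 - alpha) * p for c, p in zip(code, prev)])
    return outputs


if __name__ == "__main__":
    rng = random.Random(0)
    # A static face: the true latent code is the same in all 50 frames.
    true_codes = [[0.0] * 8 for _ in range(50)]
    independent = per_frame_inversion(true_codes, noise_std=0.1, rng=rng)
    recurrent = recurrent_inversion(independent, alpha=0.5)
    print("per-frame jitter:", temporal_jitter(independent))
    print("recurrent jitter:", temporal_jitter(recurrent))
```

In this toy setting the recurrent variant shows lower frame-to-frame jitter, at the cost of lagging behind genuine motion; a learned compensation, as the abstract describes, is meant to avoid that trade-off.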
Videos
RIGID: Recurrent GAN Inversion and Editing of Real Face Videos (YouTube)
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Face Recognition and Analysis · Speech and Audio Processing
