RIGID: Recurrent GAN Inversion and Editing of Real Face Videos

Yangyang Xu; Shengfeng He; Kwan-Yee K. Wong; Ping Luo

arXiv:2308.06097·cs.CV·August 16, 2023

TL;DR

RIGID introduces a recurrent framework for temporally coherent GAN inversion and editing of real face videos, improving consistency and quality over existing methods without retraining for different edits.

Contribution

The paper proposes a novel end-to-end recurrent approach that models temporal relations for consistent video GAN inversion and editing, addressing issues of inconsistency and noise.

Findings

1. Outperforms state-of-the-art methods in inversion quality.

2. Achieves more consistent facial editing across frames.

3. Supports arbitrary edits without retraining for each edit.

Abstract

GAN inversion is indispensable for applying the powerful editability of GANs to real images. However, existing methods invert video frames individually, often leading to undesired inconsistent results over time. In this paper, we propose a unified recurrent framework, named Recurrent vIdeo GAN Inversion and eDiting (RIGID), to explicitly and simultaneously enforce temporally coherent GAN inversion and facial editing of real videos. Our approach models the temporal relations between current and previous frames from three aspects. First, to enable a faithful real video reconstruction, we maximize the inversion fidelity and consistency by learning a temporally compensated latent code. Second, we observe that incoherent noise lies in the high-frequency domain and can be disentangled from the latent space. Third, to remove the inconsistency after attribute…
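The abstract's second observation, that incoherent noise sits in the high-frequency domain, can be illustrated with a plain image-space band split. This is only a minimal sketch under stated assumptions: the paper disentangles the noise from the latent space, and that procedure is not described in this excerpt. The function name `split_frequency_bands`, the `cutoff` value, and the synthetic frame are all hypothetical and chosen for illustration.

```python
import numpy as np


def split_frequency_bands(frame: np.ndarray, cutoff: float = 0.1):
    """Split a 2-D grayscale frame into low- and high-frequency parts.

    `cutoff` is a hypothetical radial threshold in cycles per pixel
    (the largest frequency along one axis is 0.5). It is an assumption
    made for this sketch, not a value taken from the paper.
    """
    h, w = frame.shape
    fy = np.fft.fftfreq(h)[:, None]  # vertical frequencies, shape (h, 1)
    fx = np.fft.fftfreq(w)[None, :]  # horizontal frequencies, shape (1, w)
    low_mask = np.sqrt(fx**2 + fy**2) <= cutoff  # ideal circular low-pass mask
    spectrum = np.fft.fft2(frame)
    low = np.fft.ifft2(spectrum * low_mask).real  # keep only slow variations
    high = frame - low  # everything the low-pass filter removed
    return low, high


# Synthetic stand-in for a frame: slow structure plus small random flicker.
rng = np.random.default_rng(0)
yy, xx = np.mgrid[0:64, 0:64]
smooth = np.sin(2 * np.pi * xx / 64) + np.cos(2 * np.pi * yy / 32)
noise = 0.1 * rng.standard_normal((64, 64))

low, high = split_frequency_bands(smooth + noise)
print("low-band std:", low.std(), "high-band std:", high.std())
```

In this toy setup the slow structure lands almost entirely in `low`, while most of the random flicker ends up in `high`, which is the intuition behind treating frame-to-frame noise as a high-frequency component.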

Videos

RIGID: Recurrent GAN Inversion and Editing of Real Face Videos · YouTube

Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis · Face recognition and analysis · Speech and Audio Processing