Temporally Consistent Semantic Video Editing
Yiran Xu, Badour AlBahar, Jia-Bin Huang

TL;DR
This paper introduces a method for temporally consistent semantic video editing using GANs, reducing flickering artifacts by optimizing latent codes and the generator to ensure smooth, coherent edits across video frames.
Contribution
The proposed approach effectively minimizes temporal inconsistencies in GAN-based video editing, improving the quality and coherence of edited videos compared to existing methods.
Findings
Reduces flickering artifacts in edited videos
Achieves more consistent semantic edits across frames
Outperforms baseline methods in quality and coherence
Abstract
Generative adversarial networks (GANs) have demonstrated impressive image generation quality and semantic editing capability for real images, e.g., changing object classes, modifying attributes, or transferring styles. However, applying such GAN-based editing to a video independently for each frame inevitably results in temporal flickering artifacts. We present a simple yet effective method to facilitate temporally coherent video editing. Our core idea is to minimize the temporal photometric inconsistency by optimizing both the latent code and the pre-trained generator. We evaluate the quality of our editing on different domains and GAN inversion techniques and show favorable results against the baselines.
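To make the core idea concrete, here is a minimal sketch in Python (NumPy), not the authors' implementation. It assumes a toy linear "generator" (a fixed matrix), frames that are already spatially aligned (so no optical-flow warping is applied), and an anchor term that keeps each latent code near its initial edited value. The names `temporal_inconsistency`, `smooth_latents`, and `anchor_weight` are hypothetical, and only the latent codes are optimized here, whereas the abstract describes optimizing the generator as well.

```python
import numpy as np


def temporal_inconsistency(frames):
    """Mean squared difference between consecutive frames.

    Assumption: frames are already aligned, so adjacent frames are compared
    pixel by pixel without any optical-flow warping.
    """
    frames = np.asarray(frames, dtype=np.float64)
    diffs = frames[1:] - frames[:-1]
    return float(np.mean(diffs ** 2))


def smooth_latents(latents, generator_matrix, steps=200, anchor_weight=0.1):
    """Gradient descent on per-frame latent codes of a toy linear generator.

    Minimizes
        sum_t ||G w_{t+1} - G w_t||^2 + anchor_weight * sum_t ||w_t - w_t^0||^2
    where G is the (hypothetical) linear generator and w_t^0 are the
    initial per-frame latent codes.
    """
    G = np.asarray(generator_matrix, dtype=np.float64)   # shape (D, K)
    w0 = np.asarray(latents, dtype=np.float64)           # shape (T, K)
    w = w0.copy()

    # Conservative step size: 1 / (upper bound on the largest Hessian eigenvalue).
    sigma_max = np.linalg.norm(G, 2)
    lr = 1.0 / (8.0 * sigma_max ** 2 + 2.0 * anchor_weight)

    for _ in range(steps):
        frames = w @ G.T                  # generated frames, shape (T, D)
        d = frames[1:] - frames[:-1]      # consecutive-frame differences
        grad_frames = np.zeros_like(frames)
        grad_frames[1:] += 2.0 * d        # d/d(frame_{t+1}) of ||d_t||^2
        grad_frames[:-1] -= 2.0 * d       # d/d(frame_t) of ||d_t||^2
        grad = grad_frames @ G + 2.0 * anchor_weight * (w - w0)
        w -= lr * grad
    return w


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    G = rng.normal(size=(8, 4))           # toy generator: 4-dim latent -> 8 "pixels"
    w_init = rng.normal(size=(6, 4))      # independently edited latent codes, 6 frames
    w_opt = smooth_latents(w_init, G)
    print("before:", temporal_inconsistency(w_init @ G.T))
    print("after: ", temporal_inconsistency(w_opt @ G.T))
```

The anchor term is a design choice of this sketch: without it, the trivial minimizer would collapse all frames to a single image, discarding the per-frame edits. A real pipeline would replace the matrix with a pre-trained GAN generator and compare frames after motion compensation.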
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Advanced Vision and Imaging · Digital Media Forensic Detection
