End-to-End Visual Editing with a Generatively Pre-Trained Artist

Andrew Brown; Cheng-Yang Fu; Omkar Parkhi; Tamara L. Berg; Andrea Vedaldi

arXiv:2205.01668 · cs.CV · May 4, 2022 · 1 citation

TL;DR

This paper introduces an end-to-end, self-supervised transformer-based model for targeted image editing that learns to blend regions based on simulated edits, outperforming previous methods in quality and efficiency.

Contribution

The paper presents a novel self-supervised training approach for image editing using a transformer, eliminating the need for real edit examples and enabling intuitive control over blending effects.

Findings

01. Outperforms prior methods in edit quality and efficiency

02. Uses self-supervised augmentation to simulate training data

03. Demonstrates superior results across multiple datasets

Abstract

We consider the targeted image editing problem: blending a region in a source image with a driver image that specifies the desired change. Differently from prior works, we solve this problem by learning a conditional probability distribution of the edits, end-to-end. Training such a model requires addressing a fundamental technical challenge: the lack of example edits for training. To this end, we propose a self-supervised approach that simulates edits by augmenting off-the-shelf images in a target domain. The benefits are remarkable: implemented as a state-of-the-art auto-regressive transformer, our approach is simple, sidesteps difficulties with previous methods based on GAN-like priors, obtains significantly better edits, and is efficient. Furthermore, we show that different blending effects can be learned by an intuitive control of the augmentation process, with no other changes…
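The self-supervised idea in the abstract, simulating an edit from a single off-the-shelf image, can be sketched roughly as follows. This is a minimal illustration and not the authors' pipeline: the function name `simulate_edit`, the rectangular region, blanking the region in the source, and the specific augmentations (a random horizontal flip and a per-channel brightness gain) are all assumptions made for the example. The abstract only states that edits are simulated by augmenting images and that the choice of augmentation controls the learned blending effect.

```python
import numpy as np


def simulate_edit(image, box, rng, jitter=0.2):
    """Build one hypothetical (source, driver, mask, target) tuple from one image.

    image: float array of shape (H, W, C) with values in [0, 1]
    box:   (top, left, height, width) of the region treated as "edited"
    rng:   numpy.random.Generator used for the random augmentation
    """
    top, left, h, w = box

    # The unedited image serves as the ground-truth edit result.
    target = image.copy()

    # Binary mask: 1 inside the edit region, 0 elsewhere.
    mask = np.zeros(image.shape[:2], dtype=np.float32)
    mask[top:top + h, left:left + w] = 1.0

    # Source: the image with the edit region blanked out (an assumption).
    source = image * (1.0 - mask[..., None])

    # Driver: the region's own content, perturbed so that a model cannot
    # simply paste it back and must learn to blend it with the context.
    crop = image[top:top + h, left:left + w]
    if rng.random() < 0.5:
        crop = crop[:, ::-1]  # random horizontal flip
    gain = 1.0 + rng.uniform(-jitter, jitter, size=(1, 1, image.shape[2]))
    driver = np.clip(crop * gain, 0.0, 1.0)

    return source, driver, mask, target


rng = np.random.default_rng(0)
image = rng.random((8, 8, 3))  # stand-in for a real training image
source, driver, mask, target = simulate_edit(image, (2, 3, 4, 2), rng)
print(source.shape, driver.shape, int(mask.sum()))
```

Under these assumptions, a conditional model would be trained to predict `target` given `source`, `driver`, and `mask`. Swapping in stronger or weaker augmentations for the driver is one way to picture the "intuitive control of the augmentation process" the abstract mentions.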

Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis · Multimodal Machine Learning Applications · Domain Adaptation and Few-Shot Learning