End-to-End Visual Editing with a Generatively Pre-Trained Artist
Andrew Brown, Cheng-Yang Fu, Omkar Parkhi, Tamara L. Berg, Andrea Vedaldi

TL;DR
This paper introduces an end-to-end, self-supervised transformer-based model for targeted image editing that learns to blend regions based on simulated edits, outperforming previous methods in quality and efficiency.
Contribution
The paper presents a novel self-supervised training approach for image editing using a transformer, eliminating the need for real edit examples and enabling intuitive control over blending effects.
Findings
Outperforms prior methods in edit quality and efficiency
Uses self-supervised augmentation to simulate training data
Demonstrates superior results across multiple datasets
Abstract
We consider the targeted image editing problem: blending a region in a source image with a driver image that specifies the desired change. Differently from prior works, we solve this problem by learning a conditional probability distribution of the edits, end-to-end. Training such a model requires addressing a fundamental technical challenge: the lack of example edits for training. To this end, we propose a self-supervised approach that simulates edits by augmenting off-the-shelf images in a target domain. The benefits are remarkable: implemented as a state-of-the-art auto-regressive transformer, our approach is simple, sidesteps difficulties with previous methods based on GAN-like priors, obtains significantly better edits, and is efficient. Furthermore, we show that different blending effects can be learned by an intuitive control of the augmentation process, with no other changes…
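The self-supervised idea in the abstract, simulating an edit by augmenting an ordinary image, can be illustrated with a minimal sketch. This is not the authors' pipeline: the function name `simulate_edit`, the box format, the choice of blanking the region with zeros, and the specific augmentations (horizontal flip, per-channel brightness gain) are all assumptions made for illustration. The sketch only shows how a single unedited image could yield a (source, driver, target) training triple without any real edit examples.

```python
import numpy as np

def simulate_edit(image, box, rng, jitter=0.2):
    """Build one (source, driver, target) triple from a single image.

    Hypothetical helper, not taken from the paper.
    image:  float array in [0, 1] with shape (H, W, 3)
    box:    (top, left, height, width) of the region to edit (assumed format)
    rng:    numpy random Generator
    jitter: maximum relative per-channel brightness change applied to the driver
    """
    top, left, h, w = box

    # Target: the untouched image is what the model should reconstruct.
    target = image.copy()

    # Source: the same image with the edit region blanked out (assumed masking).
    source = image.copy()
    source[top:top + h, left:left + w] = 0.0

    # Driver: the region's true content, perturbed so that pasting it back
    # unchanged would not reproduce the target exactly.
    driver = image[top:top + h, left:left + w].copy()
    if rng.random() < 0.5:
        driver = driver[:, ::-1]  # horizontal flip
    gain = 1.0 + rng.uniform(-jitter, jitter, size=(1, 1, 3))
    driver = np.clip(driver * gain, 0.0, 1.0)

    return source, driver, target
```

In this sketch, `jitter` and the flip probability stand in for the "control of the augmentation process" the abstract mentions: changing how strongly the driver is perturbed changes what the model must learn to correct when blending. A model trained on such triples would be conditioned on `source` and `driver` and asked to predict `target`.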
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Multimodal Machine Learning Applications · Domain Adaptation and Few-Shot Learning
