Emu Edit: Precise Image Editing via Recognition and Generation Tasks
Shelly Sheynin, Adam Polyak, Uriel Singer, Yuval Kirstain, Amit Zohar, Oron Ashual, Devi Parikh, Yaniv Taigman

TL;DR
Emu Edit is a multi-task image editing model that achieves state-of-the-art results by training on a diverse set of tasks, all formulated as generative tasks, and by conditioning on learned task embeddings. It also generalizes to new tasks from few labeled examples.
Contribution
The paper introduces Emu Edit, a novel multi-task image editing framework that unifies various editing tasks as generative problems and enhances performance with learned task embeddings.
Findings
Achieves state-of-the-art performance in instruction-based image editing.
Successfully generalizes to new tasks with few labeled examples.
Provides a new benchmark with seven diverse image editing tasks.
Abstract
Instruction-based image editing holds immense potential for a variety of applications, as it enables users to perform any editing operation using a natural language instruction. However, current models in this domain often struggle with accurately executing user instructions. We present Emu Edit, a multi-task image editing model which sets state-of-the-art results in instruction-based image editing. To develop Emu Edit we train it to multi-task across an unprecedented range of tasks, such as region-based editing, free-form editing, and Computer Vision tasks, all of which are formulated as generative tasks. Additionally, to enhance Emu Edit's multi-task learning abilities, we provide it with learned task embeddings which guide the generation process towards the correct edit type. Both these elements are essential for Emu Edit's outstanding performance. Furthermore, we show that Emu Edit…
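To make the idea of learned task embeddings concrete, the following is a minimal sketch, not the authors' implementation. It assumes one embedding vector per task, looked up by name and added to a diffusion timestep embedding so that the same model receives a different conditioning signal for each edit type. The abstract does not say how the embeddings are injected, so the additive injection, the task names, the dimension `DIM`, and every function name here are hypothetical. In a real model the table entries would be trained parameters rather than fixed random values.

```python
import math
import random

# Hypothetical task names, loosely following the task families in the abstract.
TASKS = ["region_edit", "free_form_edit", "vision_task"]
DIM = 8  # illustrative embedding width (assumption)


def timestep_embedding(t, dim=DIM):
    """Standard sinusoidal embedding of a diffusion timestep t."""
    half = dim // 2
    freqs = [math.exp(-math.log(10000.0) * i / half) for i in range(half)]
    return [math.sin(t * f) for f in freqs] + [math.cos(t * f) for f in freqs]


def init_task_table(tasks, dim=DIM, seed=0):
    """One vector per task; stands in for a learned embedding table."""
    rng = random.Random(seed)
    return {name: [rng.gauss(0.0, 0.02) for _ in range(dim)] for name in tasks}


def conditioning_vector(t, task, table):
    """Assumed injection scheme: add the task embedding to the timestep embedding."""
    t_emb = timestep_embedding(t)
    task_emb = table[task]
    return [a + b for a, b in zip(t_emb, task_emb)]


table = init_task_table(TASKS)
cond_a = conditioning_vector(10, "region_edit", table)
cond_b = conditioning_vector(10, "free_form_edit", table)
```

With the same timestep, `cond_a` and `cond_b` differ only by the difference between the two task vectors, which is the signal a denoiser could use to tell edit types apart.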
Taxonomy
Topics: Multimodal Machine Learning Applications · Domain Adaptation and Few-Shot Learning · Cell Image Analysis Techniques
