MagicBrush: A Manually Annotated Dataset for Instruction-Guided Image Editing
Kai Zhang, Lingbo Mo, Wenhu Chen, Huan Sun, Yu Su

TL;DR
MagicBrush is a large, manually annotated dataset for instruction-guided image editing. Models trained on it produce more accurate and realistic edits than existing methods.
Contribution
We created the first large-scale, manually annotated dataset for instruction-guided image editing, covering diverse scenarios and enabling significant improvements in model performance.
Findings
Fine-tuning InstructPix2Pix on MagicBrush yields better image quality.
Current baselines struggle with the dataset's complexity.
The dataset exposes gaps between existing methods and real-world editing needs.
Abstract
Text-guided image editing is widely needed in daily life, ranging from personal use to professional applications such as Photoshop. However, existing methods are either zero-shot or trained on an automatically synthesized dataset, which contains a high volume of noise. Thus, they still require lots of manual tuning to produce desirable outcomes in practice. To address this issue, we introduce MagicBrush (https://osu-nlp-group.github.io/MagicBrush/), the first large-scale, manually annotated dataset for instruction-guided real image editing that covers diverse scenarios: single-turn, multi-turn, mask-provided, and mask-free editing. MagicBrush comprises over 10K manually annotated triplets (source image, instruction, target image), which supports training large-scale text-guided image editing models. We fine-tune InstructPix2Pix on MagicBrush and show that the new model can produce…
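The triplet structure and the four editing scenarios named in the abstract can be pictured with a small data model. This is only an illustrative sketch, not the dataset's actual schema or loader: the class names EditTurn and EditSession, the field names, the file paths, and the example instructions are all hypothetical, and the rule that each turn in a multi-turn session edits the previous turn's output is an assumption.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class EditTurn:
    """One (source image, instruction, target image) triplet. Hypothetical schema."""
    source_path: str
    instruction: str
    target_path: str
    mask_path: Optional[str] = None  # None models the mask-free setting

    @property
    def mask_provided(self) -> bool:
        return self.mask_path is not None


@dataclass
class EditSession:
    """An ordered list of turns; a single turn models single-turn editing."""
    turns: List[EditTurn] = field(default_factory=list)

    @property
    def is_multi_turn(self) -> bool:
        return len(self.turns) > 1

    def is_chained(self) -> bool:
        # Assumption: each turn starts from the previous turn's target image.
        return all(
            prev.target_path == nxt.source_path
            for prev, nxt in zip(self.turns, self.turns[1:])
        )


# Hypothetical two-turn session: the first turn has a mask, the second is mask-free.
session = EditSession(turns=[
    EditTurn("img_0.png", "add a hat to the dog", "img_1.png", mask_path="mask_1.png"),
    EditTurn("img_1.png", "make the hat red", "img_2.png"),
])
```

Keeping the mask optional lets one record type cover both the mask-provided and mask-free settings, and a session of length one covers single-turn editing without a separate type.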
Taxonomy
Topics: Multimodal Machine Learning Applications · Advanced Image and Video Retrieval Techniques · Generative Adversarial Networks and Image Synthesis
