UltraEdit: Instruction-based Fine-Grained Image Editing at Scale
Haozhe Zhao, Xiaojian Ma, Liang Chen, Shuzheng Si, Rujie Wu, Kaikai An, Peiyu Yu, Minjia Zhang, Qing Li, Baobao Chang

TL;DR
UltraEdit introduces a large-scale, high-quality dataset for instruction-based image editing, leveraging real images, diverse instructions, and region annotations to improve editing models and benchmarks.
Contribution
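UltraEdit contributes a dataset of approximately 4 million automatically generated editing samples that combines real images, LLM-generated instructions, and automatically produced region annotations, addressing limitations of prior datasets such as InstructPix2Pix and MagicBrush.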
Findings
Models trained on UltraEdit achieve new state-of-the-art results on benchmarks.
Real image anchors and region-based data significantly improve editing performance.
UltraEdit enables broader and more accurate image editing capabilities.
Abstract
This paper presents UltraEdit, a large-scale (approximately 4 million editing samples), automatically generated dataset for instruction-based image editing. Our key idea is to address the drawbacks in existing image editing datasets like InstructPix2Pix and MagicBrush, and provide a systematic approach to producing massive and high-quality image editing samples. UltraEdit offers several distinct advantages: 1) It features a broader range of editing instructions by leveraging the creativity of large language models (LLMs) alongside in-context editing examples from human raters; 2) Its data sources are based on real images, including photographs and artworks, which provide greater diversity and reduced bias compared to datasets solely generated by text-to-image models; 3) It also supports region-based editing, enhanced by high-quality, automatically produced region annotations. Our…
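To make the distinction between free-form and region-based editing samples concrete, the following is a minimal sketch of how one such sample could be represented. The class name, field names, and the helper `restrict_edit_to_region` are hypothetical and chosen for illustration only; they are not the dataset's actual schema or the authors' pipeline. Images are toy nested lists of grayscale values so the example runs without third-party libraries.

```python
from dataclasses import dataclass
from typing import List, Optional

Image = List[List[int]]  # toy grayscale image: rows of pixel values
Mask = List[List[int]]   # 1 = editable region, 0 = keep the source pixel


@dataclass
class EditSample:
    """Hypothetical record for one instruction-based editing sample."""
    source: Image
    instruction: str
    edited: Image
    region_mask: Optional[Mask] = None  # None means free-form editing

    @property
    def is_region_based(self) -> bool:
        return self.region_mask is not None


def restrict_edit_to_region(sample: EditSample) -> Image:
    """Keep edited pixels inside the mask and source pixels outside it."""
    if sample.region_mask is None:
        # Free-form sample: the whole edited image is the target.
        return [row[:] for row in sample.edited]
    return [
        [e if m else s for s, e, m in zip(src_row, edit_row, mask_row)]
        for src_row, edit_row, mask_row in zip(
            sample.source, sample.edited, sample.region_mask
        )
    ]


sample = EditSample(
    source=[[10, 10], [10, 10]],
    instruction="brighten the top-left corner",
    edited=[[200, 90], [90, 90]],
    region_mask=[[1, 0], [0, 0]],
)
result = restrict_edit_to_region(sample)
print(result)
```

In this sketch, only the masked pixel takes its value from the edited image; every pixel outside the mask is copied from the source, which is the behavior a region annotation is meant to constrain.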
Taxonomy
Topics: Image Retrieval and Classification Techniques · Advanced Image and Video Retrieval Techniques · Generative Adversarial Networks and Image Synthesis
Methods: Sparse Evolutionary Training
