Beyond Editing Pairs: Fine-Grained Instructional Image Editing via Multi-Scale Learnable Regions
Chenrui Ma, Xi Xiao, Tianyang Wang, Yanning Shen

TL;DR
This paper introduces a novel instruction-driven image editing method that uses multi-scale learnable regions and leverages large text-image datasets, achieving high-fidelity, precise, and instruction-consistent edits without relying on editing pair datasets.
Contribution
The work proposes a new paradigm for image editing that utilizes widely available text-image pairs and multi-scale learnable regions, surpassing existing dataset-dependent and dataset-free methods.
Findings
Achieves state-of-the-art performance on various benchmarks.
Demonstrates high adaptability across different generative models.
Provides precise and instruction-consistent image editing results.
Abstract
Current text-driven image editing methods typically follow one of two directions: relying on large-scale, high-quality editing pair datasets to improve editing precision and diversity, or exploring alternative dataset-free techniques. However, constructing large-scale editing datasets requires carefully designed pipelines, is time-consuming, and often results in unrealistic samples or unwanted artifacts. Meanwhile, dataset-free methods may suffer from limited instruction comprehension and restricted editing capabilities. Faced with these challenges, the present work develops a novel paradigm for instruction-driven image editing that leverages abundant, widely available text-image pairs instead of relying on editing pair datasets. Our approach introduces a multi-scale learnable region to localize and guide the editing process. By treating the alignment between images and their…
Taxonomy
Topics: Open Education and E-Learning
