Towards Small Object Editing: A Benchmark Dataset and A Training-Free Approach

Qihe Pan; Zhen Zhao; Zicheng Wang; Sifan Long; Yiming Wu; Wei Ji; Haoran Liang; Ronghua Liang

arXiv:2411.01545·cs.CV·November 5, 2024

TL;DR

This paper introduces a training-free method for small object image editing guided by text, addressing alignment issues in diffusion models, and provides a new benchmark dataset for evaluation.

Contribution

The paper presents a novel training-free approach with attention guidance for small object editing and introduces SOEBench, a standardized benchmark for evaluation.
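To make the alignment problem concrete, the following is a minimal toy sketch in plain Python. It is not the paper's method or code: the functions `cross_attention` and `mass_inside_mask`, and every number in the toy setup, are assumptions made up for illustration. It computes a softmax cross-attention map between image patches and text tokens, then measures how much of one token's attention actually lands inside a small object region.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def cross_attention(queries, keys):
    """One attention row per image patch: softmax(q . k / sqrt(d)) over text tokens."""
    dim = len(keys[0])
    rows = []
    for q in queries:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(dim) for k in keys]
        rows.append(softmax(scores))
    return rows

def mass_inside_mask(attn, token_idx, mask):
    """Fraction of a token's attention (summed over patches) that falls inside the mask."""
    column = [row[token_idx] for row in attn]
    inside = sum(w for w, m in zip(column, mask) if m)
    return inside / sum(column)

# Hypothetical toy setup: 4 image patches, 2 text tokens, feature dimension 2.
patch_queries = [[4.0, 0.0], [0.0, 1.0], [0.0, 1.0], [0.0, 1.0]]
token_keys = [[1.0, 0.0], [0.0, 1.0]]      # token 0: object word, token 1: background word
object_mask = [True, False, False, False]  # the small object covers only patch 0

attn = cross_attention(patch_queries, token_keys)
ratio = mass_inside_mask(attn, token_idx=0, mask=object_mask)
print(f"share of object-token attention inside the object region: {ratio:.3f}")
```

In this toy setup the object patch attends strongly to the object token, yet less than half of that token's total attention falls inside the object region, because the many background patches each contribute a little. An attention-guidance scheme would aim to raise this kind of ratio; how the paper actually does so is not specified in this summary.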

Findings

01 Significant improvement in small object fidelity and accuracy

02 Effective alignment of cross-modal attention maps

03 Benchmark dataset facilitates standardized evaluation

Abstract

A plethora of text-guided image editing methods has recently been developed by leveraging the impressive capabilities of large-scale diffusion-based generative models, especially Stable Diffusion. Despite the success of diffusion models in producing high-quality images, their application to small object generation has been limited by the difficulty of aligning cross-modal attention maps between the text and these objects. Our approach offers a training-free method that significantly mitigates this alignment issue with local and global attention guidance, enhancing the model's ability to accurately render small objects in accordance with textual descriptions. We detail the methodology of our approach, emphasizing its divergence from traditional generation techniques and highlighting its advantages. More importantly, we also provide SOEBench (Small Object Editing), a…

Code & Models

Repositories

panqihe-zjut/SOEBench (PyTorch, official)

Taxonomy

Methods: Softmax · Attention Is All You Need · Diffusion