Localize, Understand, Collaborate: Semantic-Aware Dragging via Intention Reasoner

Xing Cui; Peipei Li; Zekun Li; Xuannan Liu; Yueying Zou; Zhaofeng He

arXiv:2406.00432 · cs.CV · October 23, 2024

TL;DR

LucidDrag introduces a semantic-aware editing framework that infers multiple editing intentions and guides image manipulation to improve flexibility and quality in drag-based editing tasks.

Contribution

It shifts from deterministic drag estimation to a multi-strategy intention reasoning approach with collaborative guidance for enhanced editing control.

Findings

01

Outperforms previous methods in qualitative assessments.

02

Achieves higher editing accuracy and image quality.

03

Demonstrates robustness across diverse editing scenarios.

Abstract

Flexible and accurate drag-based editing is a challenging task that has recently garnered significant attention. Current methods typically model this problem as automatically learning "how to drag" through point dragging and often produce one deterministic estimation, which presents two key limitations: 1) overlooking the inherently ill-posed nature of drag-based editing, where multiple results may correspond to a given input, as illustrated in Fig. 1; 2) ignoring the constraint of image quality, which may lead to unexpected distortion. To alleviate these limitations, we propose LucidDrag, which shifts the focus from "how to drag" to a "what-then-how" paradigm. LucidDrag comprises an intention reasoner and a collaborative guidance sampling mechanism. The former infers several optimal editing strategies, identifying what content to edit and in what semantic direction. Based on the former, the latter…
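As a rough illustration of the "what-then-how" idea described in the abstract, the sketch below shows how a single drag input could map to several candidate editing strategies instead of one deterministic result. It is a toy example, not the authors' implementation: the names `DragInput`, `EditingStrategy`, `rank_strategies`, and `plan_edits`, the `score` field, and the example labels are all hypothetical, and the candidate strategies are supplied by hand where the paper would use a learned intention reasoner.

```python
from dataclasses import dataclass
from typing import List, Tuple

Point = Tuple[int, int]


@dataclass(frozen=True)
class DragInput:
    """A single point drag: where it starts and where it ends (pixel coords)."""
    source: Point
    target: Point


@dataclass(frozen=True)
class EditingStrategy:
    """Hypothetical container for one inferred intention."""
    content: str     # "what" to edit, e.g. an object or region label
    direction: str   # the semantic direction of the edit
    score: float     # hypothetical plausibility score (assumption)


def rank_strategies(candidates: List[EditingStrategy], top_k: int) -> List[EditingStrategy]:
    """'What' step (toy stand-in): keep the top_k highest-scoring candidates."""
    return sorted(candidates, key=lambda s: s.score, reverse=True)[:top_k]


def plan_edits(drag: DragInput, candidates: List[EditingStrategy], top_k: int = 2) -> List[str]:
    """'How' step (toy stand-in): produce one edit plan per retained strategy.

    A real system would run guided sampling here; this sketch only returns
    a text description so the one-input, many-results structure is visible.
    """
    return [
        f"edit '{s.content}' toward '{s.direction}' along {drag.source}->{drag.target}"
        for s in rank_strategies(candidates, top_k)
    ]


if __name__ == "__main__":
    drag = DragInput(source=(10, 20), target=(30, 20))
    candidates = [
        EditingStrategy("head", "turn right", 0.6),
        EditingStrategy("whole body", "move right", 0.9),
        EditingStrategy("background", "shift", 0.1),
    ]
    for plan in plan_edits(drag, candidates):
        print(plan)
```

The point of the sketch is only the shape of the interface: the same drag yields more than one plausible edit plan, which is the ill-posedness the abstract highlights.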

Code & Models

Repositories

cuixing100876/luciddrag-neurips2024
none · Official

Taxonomy

Topics: Semantic Web and Ontologies · Topic Modeling · Natural Language Processing Techniques

Methods: Focus