FlowDrag: 3D-aware Drag-based Image Editing with Mesh-guided Deformation Vector Flow Fields
Gwanhyeong Koo, Sunjae Yoon, Younghwan Lee, Ji Woo Hong, Chang D. Yoo

TL;DR
FlowDrag introduces a 3D-aware mesh-guided deformation approach for image editing, improving geometric consistency and accuracy in drag-based manipulations, and provides a new benchmark dataset for evaluation.
Contribution
The paper presents FlowDrag, a novel method that integrates 3D mesh guidance into drag-based image editing, and introduces VFD, a benchmark dataset with ground-truth frames for better evaluation.
Findings
FlowDrag outperforms existing methods on VFD Bench and DragBench.
Mesh-guided deformation improves geometric consistency in edits.
The VFD dataset provides ground-truth for more accurate benchmarking.
Abstract
Drag-based editing allows precise object manipulation through point-based control, offering user convenience. However, current methods often suffer from a geometric inconsistency problem by focusing exclusively on matching user-defined points, neglecting the broader geometry and leading to artifacts or unstable edits. We propose FlowDrag, which leverages geometric information for more accurate and coherent transformations. Our approach constructs a 3D mesh from the image, using an energy function to guide mesh deformation based on user-defined drag points. The resulting mesh displacements are projected into 2D and incorporated into a UNet denoising process, enabling precise handle-to-target point alignment while preserving structural integrity. Additionally, existing drag-editing benchmarks provide no ground truth, making it difficult to assess how accurately the edits match the…
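The step of projecting mesh displacements into 2D can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a simple pinhole camera with the mesh already in camera coordinates, it takes the deformed vertices as given rather than solving any energy function, and the names `project_points`, `projected_displacements`, and `focal` are hypothetical.

```python
import numpy as np

def project_points(vertices, focal=1.0):
    """Pinhole projection of (N, 3) camera-space points to (N, 2) image-plane points.

    Assumes every vertex has z > 0 (in front of the camera).
    """
    vertices = np.asarray(vertices, dtype=float)
    return focal * vertices[:, :2] / vertices[:, 2:3]

def projected_displacements(rest_vertices, deformed_vertices, focal=1.0):
    """Per-vertex 2D displacement: projection after deformation minus projection before."""
    return project_points(deformed_vertices, focal) - project_points(rest_vertices, focal)

# Two illustrative vertices before and after a hypothetical mesh deformation.
rest = np.array([[0.0, 0.0, 2.0],
                 [1.0, 0.0, 2.0]])
deformed = np.array([[0.5, 0.0, 2.0],
                     [1.0, 1.0, 4.0]])

flow = projected_displacements(rest, deformed)
print(flow)
```

Each row of `flow` is a sparse 2D vector anchored at the projected rest position of one vertex. How such vectors would be densified and fed into a denoising network is outside the scope of this sketch.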
Taxonomy
Topics: 3D Shape Modeling and Analysis · Generative Adversarial Networks and Image Synthesis · Robot Manipulation and Learning
