InstantDrag: Improving Interactivity in Drag-based Image Editing
Joonghyuk Shin, Daehyeon Choi, Jaesik Park

TL;DR
InstantDrag is a novel, optimization-free method that enables real-time, photo-realistic drag-based image editing using only an image and a drag instruction, significantly improving interactivity and speed.
Contribution
We introduce InstantDrag, a new pipeline of two networks that learn motion dynamics for drag editing without masks or text prompts, improving interactivity and efficiency.
Findings
Enables fast, photo-realistic edits without masks or prompts
Operates in real-time for interactive applications
Demonstrates effectiveness on facial and general scenes
Abstract
Drag-based image editing has recently gained popularity for its interactivity and precision. However, despite the ability of text-to-image models to generate samples within a second, drag editing still lags behind due to the challenge of accurately reflecting user interaction while maintaining image content. Some existing approaches rely on computationally intensive per-image optimization or intricate guidance-based methods, requiring additional inputs such as masks for movable regions and text prompts, thereby compromising the interactivity of the editing process. We introduce InstantDrag, an optimization-free pipeline that enhances interactivity and speed, requiring only an image and a drag instruction as input. InstantDrag consists of two carefully designed networks: a drag-conditioned optical flow generator (FlowGen) and an optical flow-conditioned diffusion model (FlowDiffusion).…
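The two-stage structure described in the abstract can be sketched roughly as follows. This is a minimal illustration of the data flow only, not the authors' implementation: the sparse-flow encoding of a drag instruction, the names `drags_to_sparse_flow` and `edit_image`, and the callables `flow_generator` and `flow_conditioned_editor` (stand-ins for FlowGen and FlowDiffusion) are all assumptions made for this example.

```python
from typing import Callable, List, Tuple

Point = Tuple[int, int]                  # (x, y) pixel coordinate
Drag = Tuple[Point, Point]               # (source point, target point)
Flow = List[List[Tuple[float, float]]]   # flow[y][x] = (dx, dy)


def drags_to_sparse_flow(drags: List[Drag], height: int, width: int) -> Flow:
    """Rasterize drag instructions into a sparse flow map (assumed encoding).

    Every pixel is (0, 0) except the drag source points, which hold the
    displacement from source to target.
    """
    flow: Flow = [[(0.0, 0.0) for _ in range(width)] for _ in range(height)]
    for (x0, y0), (x1, y1) in drags:
        # Ignore drags whose source point lies outside the image.
        if 0 <= x0 < width and 0 <= y0 < height:
            flow[y0][x0] = (float(x1 - x0), float(y1 - y0))
    return flow


def edit_image(image, drags: List[Drag],
               flow_generator: Callable, flow_conditioned_editor: Callable):
    """Hypothetical two-stage edit: drags -> dense flow -> edited image.

    `flow_generator` stands in for a drag-conditioned optical flow generator,
    `flow_conditioned_editor` for an optical flow-conditioned diffusion model.
    Only the image and the drags are required: no mask, no text prompt.
    """
    height, width = len(image), len(image[0])
    sparse_flow = drags_to_sparse_flow(drags, height, width)
    dense_flow = flow_generator(image, sparse_flow)
    return flow_conditioned_editor(image, dense_flow)
```

In this sketch the only user inputs are the image and a list of drags, which mirrors the claim that no masks or text prompts are needed; real networks would replace the two callables.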
Taxonomy
Topics: Computer Graphics and Visualization Techniques
Methods: Diffusion
