GoodDrag: Towards Good Practices for Drag Editing with Diffusion Models

Zewei Zhang; Huan Liu; Jun Chen; Xiangyu Xu

arXiv:2404.07206·cs.CV·April 11, 2024·1 cites

TL;DR

GoodDrag is a diffusion-based drag editing framework that alternates drag and denoising operations (AlDD) and adds information-preserving motion supervision to improve editing stability and image quality, supported by a new dataset and new metrics.

Contribution

The paper presents the AlDD framework and an information-preserving motion supervision operation for drag editing, along with the Drag100 benchmark dataset and two quality metrics, the Dragging Accuracy Index and the Gemini Score.

Findings

1. GoodDrag outperforms existing methods in quality and stability.
2. The new dataset Drag100 enables better benchmarking.
3. Proposed metrics accurately assess drag editing quality.

Abstract

In this paper, we introduce GoodDrag, a novel approach to improve the stability and image quality of drag editing. Unlike existing methods that struggle with accumulated perturbations and often result in distortions, GoodDrag introduces an AlDD framework that alternates between drag and denoising operations within the diffusion process, effectively improving the fidelity of the result. We also propose an information-preserving motion supervision operation that maintains the original features of the starting point for precise manipulation and artifact reduction. In addition, we contribute to the benchmarking of drag editing by introducing a new dataset, Drag100, and developing dedicated quality assessment metrics, Dragging Accuracy Index and Gemini Score, utilizing Large Multimodal Models. Extensive experiments demonstrate that the proposed GoodDrag compares favorably against the…
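The alternation described in the abstract can be illustrated with a toy scheduling sketch. It does not run a diffusion model; it only contrasts the order of operations. All names here (`drag_then_denoise`, `alternating_schedule`, `drags_per_denoise`, `longest_drag_run`) are hypothetical, and using the longest run of consecutive drag operations as a stand-in for accumulated perturbation is an assumption made for illustration, not a measure taken from the paper.

```python
def drag_then_denoise(num_drag_ops, num_denoise_steps):
    """Baseline ordering (assumed): all drag operations first, then all denoising steps."""
    return ["drag"] * num_drag_ops + ["denoise"] * num_denoise_steps


def alternating_schedule(num_drag_ops, drags_per_denoise):
    """Alternating ordering: one denoising step after every block of drag operations.

    `drags_per_denoise` is a hypothetical block-size parameter.
    """
    if drags_per_denoise < 1:
        raise ValueError("drags_per_denoise must be at least 1")
    schedule = []
    for i in range(1, num_drag_ops + 1):
        schedule.append("drag")
        # Close the block with a denoising step once it is full.
        if i % drags_per_denoise == 0:
            schedule.append("denoise")
    return schedule


def longest_drag_run(schedule):
    """Length of the longest run of consecutive drag operations in a schedule."""
    longest = current = 0
    for op in schedule:
        current = current + 1 if op == "drag" else 0
        longest = max(longest, current)
    return longest


if __name__ == "__main__":
    baseline = drag_then_denoise(num_drag_ops=6, num_denoise_steps=3)
    alternating = alternating_schedule(num_drag_ops=6, drags_per_denoise=2)
    print(longest_drag_run(baseline), longest_drag_run(alternating))
```

With six drag operations, the baseline ordering applies all six back to back, while the alternating ordering with a block size of two never applies more than two before a denoising step intervenes.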

Peer Reviews

Decision·ICLR 2025 Poster

Reviewer 01 · Rating 8 · Confidence 4

Strengths

1. Paper comprehensively evaluates prior work and builds on the weaknesses through a simple alternating dragging & denoising step-framework.
2. The paper introduces a carefully thought out (natural + synthetic) hybrid evaluation dataset comprising diverse editing scenarios; in addition to two evaluation metrics that might be useful for the drag-editing community. Potential for higher impact.
3. Results look promising and the information preservation feature alignment loss is well grounded.
4. S

Weaknesses

1. It makes sense to compare against diffusion-based editing techniques since the paper builds on the weaknesses of prior literature. However, it would be nice to still have DragGAN in the main qualitative figure (7) and the quantitative evaluations for the sake of completeness.
2. Training hardware and memory are not specified in the main text. Consider moving them from the supplementary to the main text?

Reviewer 02 · Rating 3 · Confidence 3

Strengths

1. This paper contributes a new dataset and metric for image editing.
2. The idea for solving accumulation distortion is interesting.

Weaknesses

1. This paper does not include comparisons with recent works [1, 2], which limits the context for understanding its contributions relative to the latest advancements.
2. It lacks critical experiments on the DragBench dataset [3], especially with the MD and IF metrics, which are widely used in existing literature for standardized evaluation.
Due to the above considerations, it is challenging to assess the performance of this paper.
[1] Drag your noise: Interactive point-based editing via diffusion

Reviewer 03 · Rating 5 · Confidence 4

Strengths

1. The paper proposes an effective framework for achieving high-quality drag-based image editing results.
2. The results demonstrate a significant improvement in image fidelity during the drag editing process.
3. The authors introduce a new dataset and additional metrics for evaluating the point-based image editing task.

Weaknesses

1. The novelty of the ALDD design is limited, as it mainly alters the order of denoising during the drag updating process. A concern is whether separating the process using the hyperparameter 'B' is optimal. Is 'B' fixed, or does it require manual adjustment by the user?
2. There is confusion about whether "ALDD" and "AlDD" refer to the same concept.
3. There is an unnecessary blank area on Line 054 of the paper.
4. The authors claim that previous datasets are limited in terms of diverse drag ta

Videos

GoodDrag: Towards Good Practices for Drag Editing with Diffusion Models · slideslive

Taxonomy

Topics: Fluid Dynamics and Turbulent Flows · Traffic control and management · Heat and Mass Transfer in Porous Media

Methods: Diffusion