Drag within Prior Distribution: Text-Conditioned Point-Based Image Editing within Distribution Constraints
Haoyang Hu, Masataka Seo, Yen-Wei Chen

TL;DR
This paper introduces a novel diffusion-based image editing method that maintains distribution consistency and semantic alignment during localized manipulations, addressing issues of artifacts and ambiguity in prior approaches.
Contribution
It proposes a CLIP-guided intermediate step evaluation, a prior-preservation loss for distribution consistency, and a directionally-weighted point tracking mechanism for improved accuracy and efficiency.
Findings
Enhanced semantic alignment in edited images.
Reduced artifacts and deviations from original data distribution.
Improved editing accuracy and speed in fine-grained tasks.
Abstract
Diffusion-based point editing methods have gained significant traction in image editing tasks due to their ability to manipulate image semantics and fine details by applying localized perturbations on the manifold of the noise latent. However, these approaches face several limitations. Traditional point-based editing relies on pairs of handle and target points to define motion trajectories, which can introduce ambiguity or unnecessary alterations. Furthermore, when the distance between the handle and target points is large, the accumulated perturbations often cause the noise latent to deviate from the inversion score trajectory, resulting in unnatural artifacts. To address these issues in global editing tasks, we introduce a CLIP-based model to evaluate and guide intermediate editing steps, ensuring that the generated results remain semantically aligned. Additionally, we propose a…
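The abstract is cut off before it describes the tracking mechanism, so the following is only an illustrative sketch of what "directionally-weighted point tracking" could look like, not the authors' formulation. It assumes a nearest-feature search over candidate pixels around the current handle, with each candidate's feature distance penalized when its displacement points away from the target. The function name `track_handle`, the `weight` parameter, and the scoring formula are all hypothetical.

```python
import math


def track_handle(candidates, feature_dist, handle, target, weight=0.5):
    """Pick the next handle position among candidate pixels.

    candidates   : list of (x, y) positions near the current handle
    feature_dist : dict mapping (x, y) -> distance between that pixel's
                   feature and the original handle feature (lower is better)
    handle       : current handle position (x, y)
    target       : target position (x, y)
    weight       : strength of the directional penalty (hypothetical knob)
    """
    dx, dy = target[0] - handle[0], target[1] - handle[1]
    norm_t = math.hypot(dx, dy)

    best, best_score = handle, float("inf")
    for c in candidates:
        mx, my = c[0] - handle[0], c[1] - handle[1]
        norm_m = math.hypot(mx, my)
        if norm_t == 0 or norm_m == 0:
            cos = 1.0  # no direction to compare, so apply no penalty
        else:
            cos = (dx * mx + dy * my) / (norm_t * norm_m)
        # Assumed scoring rule: inflate the feature distance as the move
        # turns away from the handle-to-target direction (cos goes 1 -> -1).
        score = feature_dist[c] * (1.0 + weight * (1.0 - cos))
        if score < best_score:
            best, best_score = c, score
    return best
```

With `weight=0` this reduces to a plain nearest-feature search; a larger `weight` makes the tracker prefer candidates that lie along the intended motion, which is one plausible way to reduce the tracking ambiguity the abstract mentions.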
