Click2Mask: Local Editing with Dynamic Mask Generation
Omer Regev, Omri Avrahami, Dani Lischinski

TL;DR
Click2Mask is a local image editing method that dynamically generates a mask from a single user-supplied point, so new content can be added to a loosely specified area without drawing a precise mask. The authors report that it surpasses existing techniques in edit quality.
Contribution
The paper presents a novel dynamic mask generation approach that simplifies local editing by requiring only a single reference point, improving usability and accuracy over prior methods.
Findings
Outperforms state-of-the-art methods in local editing quality.
Reduces user effort by eliminating the need for detailed masks.
Enables flexible object addition unconstrained by existing segments.
Abstract
Recent advancements in generative models have revolutionized image generation and editing, making these tasks accessible to non-experts. This paper focuses on local image editing, particularly the task of adding new content to a loosely specified area. Existing methods often require a precise mask or a detailed description of the location, which can be cumbersome and prone to errors. We propose Click2Mask, a novel approach that simplifies the local editing process by requiring only a single point of reference (in addition to the content description). A mask is dynamically grown around this point during a Blended Latent Diffusion (BLD) process, guided by a masked CLIP-based semantic loss. Click2Mask surpasses the limitations of segmentation-based and fine-tuning dependent methods, offering a more user-friendly and contextually accurate solution. Our experiments demonstrate that…
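To make the mechanism in the abstract more concrete, here is a minimal NumPy sketch of two ingredients it mentions: a mask centred on a clicked point, and the blending step used in Blended Latent Diffusion, where edited latents are kept inside the mask and the source latents outside it. This is an illustration under stated assumptions, not the authors' implementation: the function names `disk_mask` and `blend_latents`, the fixed disk shape, the radius, and the random toy latents are all hypothetical. The masked CLIP-based loss that guides how the mask evolves is not reproduced here.

```python
import numpy as np


def disk_mask(height, width, click_yx, radius):
    """Binary mask that is 1 inside a disk centred on the clicked point.

    Assumption: a fixed disk stands in for the mask at one step; the paper's
    mask changes during diffusion under a masked CLIP-based loss.
    """
    ys, xs = np.mgrid[0:height, 0:width]
    cy, cx = click_yx
    inside = (ys - cy) ** 2 + (xs - cx) ** 2 <= radius ** 2
    return inside.astype(np.float32)


def blend_latents(edited, background, mask):
    """Keep edited latents inside the mask and background latents outside it."""
    return mask * edited + (1.0 - mask) * background


# Toy latents of shape (channels, height, width); real ones come from a diffusion model.
rng = np.random.default_rng(0)
background = rng.normal(size=(4, 64, 64)).astype(np.float32)
edited = rng.normal(size=(4, 64, 64)).astype(np.float32)

# Hypothetical click at row 32, column 20 in latent coordinates.
mask = disk_mask(64, 64, click_yx=(32, 20), radius=8)

# The (64, 64) mask broadcasts across the channel dimension.
blended = blend_latents(edited, background, mask)
```

In a full pipeline the blend would be applied at every denoising step, with the background latents noised to the matching timestep and the mask updated between steps rather than held fixed.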
Taxonomy
Topics: Multimodal Machine Learning Applications
Methods: Diffusion
