ResetEdit: Precise Text-guided Editing of Generated Image via Resettable Starting Latent
Hanyi Wang, Han Fang, Zheng Wang, Shilin Wang, Ee-Chien Chang

TL;DR
ResetEdit is a diffusion-based framework that embeds recoverable latent information into an image during generation, enabling precise, flexible, high-fidelity editing afterwards.
Contribution
It proposes a proactive diffusion editing method that reconstructs a resettable latent, improving editing accuracy and consistency without storing the original latents.
Findings
ResetEdit outperforms state-of-the-art methods in controllability.
It achieves higher visual fidelity in image editing tasks.
The framework seamlessly integrates with existing tuning-free editing techniques.
Abstract
Recent advances in diffusion models have enabled high-quality image generation, leading to increasing demand for post-generation editing that modifies local regions while preserving global structure. Achieving such flexible and precise editing requires a high-quality starting point: a latent representation that provides both the freedom needed for diverse modifications and the precision required for fine-grained, region-specific control. However, existing inversion-based approaches such as DDIM inversion often yield unsatisfactory starting latents, resulting in degraded edit fidelity and structural inconsistency. Ideally, the most suitable editing anchor would be the original latent used during the generation process, as it inherently captures the scene's structure and semantics. Yet, storing this latent for every generated image is impractical due to massive storage and retrieval…
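The abstract's point that DDIM inversion yields an imperfect starting latent can be illustrated with a toy sketch. This is not the paper's method or code: the noise predictor `toy_eps`, the schedule `ALPHAS`, and the helper names are all assumptions made up for illustration, and a real pipeline would use a trained network instead. The sketch runs deterministic DDIM sampling from a latent, inverts the result, and measures how far the recovered latent drifts from the original.

```python
import numpy as np

# Hypothetical cumulative-alpha schedule: index 0 is the clean end, index -1 the noisy end.
ALPHAS = np.linspace(0.999, 0.02, 11)

def ddim_step(x, eps, a_from, a_to):
    """Deterministic DDIM update moving x from noise level a_from to a_to."""
    x0_pred = (x - np.sqrt(1.0 - a_from) * eps) / np.sqrt(a_from)
    return np.sqrt(a_to) * x0_pred + np.sqrt(1.0 - a_to) * eps

def sample(x_T, eps_fn, alphas=ALPHAS):
    """Generation: walk from the noisy end down to the clean end."""
    x = x_T
    for i in range(len(alphas) - 1, 0, -1):
        x = ddim_step(x, eps_fn(x, i), alphas[i], alphas[i - 1])
    return x

def invert(x_0, eps_fn, alphas=ALPHAS):
    """DDIM inversion: walk back up, reusing the noise predicted at the
    current step as a stand-in for the (unknown) noise at the next step."""
    x = x_0
    for i in range(len(alphas) - 1):
        x = ddim_step(x, eps_fn(x, i), alphas[i], alphas[i + 1])
    return x

def toy_eps(x, i):
    # Stand-in for a trained noise predictor (assumption): any state-dependent function.
    return np.tanh(x)

def const_eps(x, i):
    # Degenerate predictor that ignores x; inversion is exact in this case only.
    return np.full_like(x, 0.3)

rng = np.random.default_rng(0)
x_T = rng.standard_normal(16)  # the "original" starting latent

gap_toy = np.abs(invert(sample(x_T, toy_eps), toy_eps) - x_T).max()
gap_const = np.abs(invert(sample(x_T, const_eps), const_eps) - x_T).max()
print(f"latent gap, state-dependent predictor: {gap_toy:.3e}")
print(f"latent gap, constant predictor:        {gap_const:.3e}")
```

In this toy setting the recovered latent matches the original only when the predicted noise does not depend on the state. With any state-dependent predictor the inversion approximation introduces a gap, which is the kind of mismatch that motivates anchoring edits on the original latent instead.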
