Improving Editability in Image Generation with Layer-wise Memory
Daneul Kim, Jaeah Lee, Jaesik Park

TL;DR
This paper introduces a layer-wise memory framework for image generation that enhances sequential editing capabilities, allowing for natural, coherent modifications with minimal user input and maintaining content consistency across multiple edits.
Contribution
The paper proposes a novel layer-wise memory approach and associated techniques to improve sequential image editing, addressing limitations of existing methods in maintaining content coherence and natural integration of new elements.
Findings
Superior performance in iterative editing tasks
Effective preservation of scene coherence across edits
Minimal user effort required for high-quality results
Abstract
Most real-world image editing tasks require multiple sequential edits to achieve the desired result. Current editing approaches, primarily designed for single-object modifications, struggle with sequential editing, especially with maintaining previous edits while integrating new objects naturally into the existing content. These limitations significantly hinder complex editing scenarios where multiple objects need to be modified while preserving their contextual relationships. We address this fundamental challenge through two key proposals: enabling rough mask inputs that preserve existing content while naturally integrating new elements, and supporting consistent editing across multiple modifications. Our framework achieves this through layer-wise memory, which stores latent representations and prompt embeddings from previous edits. We propose Background Consistency Guidance that…
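As a rough illustration of the idea that a layer-wise memory "stores latent representations and prompt embeddings from previous edits", the sketch below keeps one entry per edit and overlays the stored latents in edit order. It is a minimal sketch under stated assumptions, not the authors' implementation: the names `EditLayer`, `LayerMemory`, and `composite`, the use of plain nested lists as stand-ins for latents and embeddings, and the mask-based overlay rule are all hypothetical and do not come from the paper.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class EditLayer:
    """One memory entry per edit (hypothetical structure, not from the paper)."""
    prompt: str
    prompt_embedding: List[float]  # stand-in for a text-encoder output
    latent: List[List[float]]      # stand-in for a latent feature map
    mask: List[List[int]]          # rough user mask: 1 = region touched by this edit


@dataclass
class LayerMemory:
    """Ordered store of past edits (assumed design for illustration only)."""
    layers: List[EditLayer] = field(default_factory=list)

    def add(self, layer: EditLayer) -> None:
        # Append the newest edit; earlier entries are left untouched.
        self.layers.append(layer)

    def composite(self) -> List[List[float]]:
        # Assumption: start from the first layer's latent and let each later
        # edit overwrite only the cells inside its own mask.
        if not self.layers:
            raise ValueError("memory is empty")
        base = [row[:] for row in self.layers[0].latent]  # copy, do not mutate
        for layer in self.layers[1:]:
            for i, mask_row in enumerate(layer.mask):
                for j, inside in enumerate(mask_row):
                    if inside:
                        base[i][j] = layer.latent[i][j]
        return base


# Usage: a 2x2 "background" edit followed by a second edit in one cell.
memory = LayerMemory()
memory.add(EditLayer("a park", [0.1], [[0.0, 0.0], [0.0, 0.0]], [[1, 1], [1, 1]]))
memory.add(EditLayer("a dog", [0.2], [[5.0, 5.0], [5.0, 5.0]], [[0, 1], [0, 0]]))
result = memory.composite()
```

Because every edit keeps its own latent, mask, and prompt embedding, an earlier edit can in principle be revisited without discarding later ones; how the actual framework uses the stored entries during generation is not specified in the text above.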
Taxonomy
Topics: Visual Attention and Saliency Detection · Computer Graphics and Visualization Techniques · Advanced Image and Video Retrieval Techniques
