A Unified and Controllable Framework for Layered Image Generation with Visual Effects
Jinrui Yang, Qing Liu, Yijun Li, Mengwei Ren, Letian Zhang, Zhe Lin, Cihang Xie, Yuyin Zhou

TL;DR
LASAGNA is a unified framework for layered image generation that produces photorealistic effects and supports diverse edits in a single pass, eliminating the need for multiple models and reducing identity drift.
Contribution
The paper introduces LASAGNA, a single-pass layered image generation model that incorporates visual effects and supports flexible editing without post-processing.
Findings
LASAGNA outperforms prior methods in quality and editability.
It supports diverse conditional inputs within a unified architecture.
The paper releases the LASAGNA-48K dataset and the LASAGNA-BENCH benchmark.
Abstract
Recent image generation models produce impressive composites, but often fail to preserve the identity of user-provided content when editing specific elements: the surrounding scene may shift, and even the edited object's appearance can drift from the original. Layered representations offer a natural remedy, since they allow users to manipulate individual elements independently, but existing layered methods typically produce transparent foregrounds without realistic visual effects such as shadows and reflections. This forces the use of a second harmonization model after every edit, which in turn introduces drift. To overcome these limitations, we present LASAGNA, which generates a photorealistic background (BG) and an RGBA foreground with compelling visual effects in a single forward pass. By treating object-associated visual effects as part of the foreground (FG) layer, LASAGNA supports the…
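To make the layered idea concrete, the sketch below shows standard "over" alpha compositing of an RGBA foreground onto an RGB background, using NumPy. It is not LASAGNA's implementation. The function names `composite_over` and `translate_layer` are hypothetical, and the tiny 4x4 layers are made-up toy data. The point it illustrates is that when a shadow is stored in the foreground's alpha channel, moving the layer and recompositing leaves every background pixel untouched, so no second harmonization step is needed for this kind of edit.

```python
import numpy as np

def composite_over(fg_rgba: np.ndarray, bg_rgb: np.ndarray) -> np.ndarray:
    """Standard 'over' compositing: out = alpha * FG + (1 - alpha) * BG.

    fg_rgba: float array (H, W, 4) in [0, 1], straight (non-premultiplied) alpha.
    bg_rgb:  float array (H, W, 3) in [0, 1].
    """
    alpha = fg_rgba[..., 3:4]  # keep the last axis so it broadcasts over RGB
    return alpha * fg_rgba[..., :3] + (1.0 - alpha) * bg_rgb

def translate_layer(fg_rgba: np.ndarray, dx: int) -> np.ndarray:
    """Shift the foreground layer horizontally by dx pixels (hypothetical edit).

    Vacated pixels become fully transparent, so the background shows through.
    """
    out = np.zeros_like(fg_rgba)
    if dx > 0:
        out[:, dx:] = fg_rgba[:, :-dx]
    elif dx < 0:
        out[:, :dx] = fg_rgba[:, -dx:]
    else:
        out[:] = fg_rgba
    return out

# Toy layers (assumed data): a flat gray background, one opaque red "object"
# pixel, and one half-transparent black "shadow" pixel directly below it.
bg = np.full((4, 4, 3), 0.5)
fg = np.zeros((4, 4, 4))
fg[1, 1] = [1.0, 0.0, 0.0, 1.0]   # object
fg[2, 1] = [0.0, 0.0, 0.0, 0.5]   # shadow carried in the FG layer

original = composite_over(fg, bg)
edited = composite_over(translate_layer(fg, 2), bg)  # object and shadow move together
```

Because the shadow travels with the foreground layer, the edited composite darkens the background at the new location and restores the original background pixels at the old one.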
