EasyOmnimatte: Taming Pretrained Inpainting Diffusion Models for End-to-End Video Layered Decomposition
Yihan Hu, Xuelin Chen, Xiaodong Cun

TL;DR
EasyOmnimatte introduces a unified, end-to-end method for video layered decomposition that finetunes a pretrained video inpainting diffusion model with a dual-expert approach, improving quality and efficiency over prior multi-stage pipelines.
Contribution
It proposes the first unified, end-to-end video omnimatte method using dual experts with selective LoRA finetuning, enhancing decomposition quality and computational efficiency.
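As a rough illustration of what "selective LoRA finetuning" means, the sketch below attaches low-rank adapters to a chosen subset of blocks and leaves the rest frozen. It is a toy in plain Python, not the authors' implementation: the names `LoRALinear` and `attach_selective_lora`, the rank and alpha values, and the block indices are all hypothetical, and the source does not say which DiT blocks are selected.

```python
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


class LoRALinear:
    """A frozen weight W plus a trainable low-rank update (alpha / rank) * B @ A."""

    def __init__(self, weight, rank, alpha, rng):
        self.weight = weight  # frozen pretrained weight, shape (out_dim, in_dim)
        out_dim, in_dim = len(weight), len(weight[0])
        self.scale = alpha / rank
        # Common LoRA initialisation: A is small random, B is zero,
        # so the adapted layer starts out identical to the pretrained one.
        self.A = [[rng.gauss(0.0, 0.01) for _ in range(in_dim)] for _ in range(rank)]
        self.B = [[0.0] * rank for _ in range(out_dim)]

    def effective_weight(self):
        """Return W + scale * (B @ A), the weight actually used at inference."""
        delta = matmul(self.B, self.A)
        return [
            [w + self.scale * d for w, d in zip(w_row, d_row)]
            for w_row, d_row in zip(self.weight, delta)
        ]


def attach_selective_lora(block_weights, lora_blocks, rank=2, alpha=4.0, seed=0):
    """Attach adapters only to the block indices in `lora_blocks`.

    `block_weights` holds one weight matrix per block (a stand-in for a DiT
    block's projection layer). Blocks not listed stay frozen and unadapted.
    The choice of indices is an assumption made by the caller.
    """
    rng = random.Random(seed)
    return {
        index: LoRALinear(weight, rank, alpha, rng)
        for index, weight in enumerate(block_weights)
        if index in lora_blocks
    }
```

Calling `attach_selective_lora(blocks, lora_blocks={1, 4})` on a six-block toy model returns adapters for blocks 1 and 4 only. Passing every index instead corresponds to the all-blocks baseline that the abstract reports as failing to capture associated effects.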
Findings
Sets new state-of-the-art in video omnimatte quality.
Reduces computational cost by avoiding multiple diffusion passes.
Effectively captures foreground effects and structure.
Abstract
Existing video omnimatte methods typically rely on slow, multi-stage, or inference-time optimization pipelines that fail to fully exploit powerful generative priors, producing suboptimal decompositions. Our key insight is that, if a video inpainting model can be finetuned to remove the foreground-associated effects, then it must be inherently capable of perceiving these effects, and hence can also be finetuned for the complementary task: foreground layer decomposition with associated effects. However, although naively finetuning the inpainting model with LoRA applied to all blocks can produce high-quality alpha mattes, it fails to capture associated effects. Our systematic analysis reveals this arises because effect-related cues are primarily encoded in specific DiT blocks and become suppressed when LoRA is applied across all blocks. To address this, we introduce EasyOmnimatte, the…
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Image Enhancement Techniques · Advanced Image Processing Techniques
