EasyOmnimatte: Taming Pretrained Inpainting Diffusion Models for End-to-End Video Layered Decomposition

Yihan Hu; Xuelin Chen; Xiaodong Cun

arXiv:2512.21865 · cs.CV · December 29, 2025

Open Access

TL;DR

EasyOmnimatte introduces a unified, end-to-end method for video layered decomposition that finetunes a pretrained diffusion model with a dual-expert approach, improving quality and efficiency over prior multi-stage pipelines.

Contribution

The paper proposes the first unified, end-to-end video omnimatte method, using dual experts with selective LoRA finetuning to improve decomposition quality and computational efficiency.

Findings

01. Sets a new state of the art in video omnimatte quality.

02. Reduces computational cost by avoiding multiple diffusion passes.

03. Captures foreground-associated effects and structure.

Abstract

Existing video omnimatte methods typically rely on slow, multi-stage, or inference-time optimization pipelines that fail to fully exploit powerful generative priors, producing suboptimal decompositions. Our key insight is that, if a video inpainting model can be finetuned to remove the foreground-associated effects, then it must be inherently capable of perceiving these effects, and hence can also be finetuned for the complementary task: foreground layer decomposition with associated effects. However, although naïvely finetuning the inpainting model with LoRA applied to all blocks can produce high-quality alpha mattes, it fails to capture associated effects. Our systematic analysis reveals this arises because effect-related cues are primarily encoded in specific DiT blocks and become suppressed when LoRA is applied across all blocks. To address this, we introduce EasyOmnimatte, the…
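
To make the contrast between all-block and selective LoRA concrete, here is a minimal sketch of how one might choose which modules receive LoRA adapters in a DiT-style model. Everything in it is an assumption made for illustration: the module naming scheme (`blocks.<i>.attn.q`), the toy block count, the helper `select_lora_targets`, and the block indices `[4, 5]`. The excerpt above does not say which blocks the authors select or how their model names its layers, so this is not the paper's configuration.

```python
import re

# Hypothetical naming scheme: "blocks.<index>.<submodule>".
# A real DiT checkpoint will expose its own module names.
BLOCK_PATTERN = re.compile(r"^blocks\.(\d+)\.(.+)$")

# Hypothetical attention projections that would receive LoRA adapters.
ATTN_SUFFIXES = ("attn.q", "attn.k", "attn.v", "attn.o")


def select_lora_targets(module_names, block_indices, suffixes=ATTN_SUFFIXES):
    """Return the module names that lie in a chosen block and match a suffix."""
    chosen = set(block_indices)
    targets = []
    for name in module_names:
        match = BLOCK_PATTERN.match(name)
        if match is None:
            continue  # not part of a transformer block
        block_id, submodule = int(match.group(1)), match.group(2)
        if block_id in chosen and submodule in suffixes:
            targets.append(name)
    return targets


# Toy model with 6 blocks (illustrative size only).
NUM_BLOCKS = 6
SUBMODULES = ATTN_SUFFIXES + ("mlp.fc1", "mlp.fc2")
ALL_MODULES = [f"blocks.{i}.{s}" for i in range(NUM_BLOCKS) for s in SUBMODULES]

# Naive setting: LoRA on the attention projections of every block.
all_block_targets = select_lora_targets(ALL_MODULES, range(NUM_BLOCKS))

# Selective setting: LoRA only on a subset of blocks.
# The indices [4, 5] are arbitrary placeholders, not the paper's choice.
selective_targets = select_lora_targets(ALL_MODULES, [4, 5])
```

The resulting name lists could then be handed to whatever adapter library is in use as its list of target modules. The point of the sketch is only that the selective variant leaves the remaining blocks at their pretrained weights.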

Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis · Image Enhancement Techniques · Advanced Image Processing Techniques