Unified Long Video Inpainting and Outpainting via Overlapping High-Order Co-Denoising

Shuangquan Lyu; Steven Mao; Yue Ma

arXiv:2511.03272·cs.CV·November 6, 2025

TL;DR

This paper presents a unified method for long video inpainting and outpainting. It extends text-to-video diffusion models to generate arbitrarily long, spatially edited videos with high fidelity and temporal consistency.

Contribution

The authors introduce a novel approach combining LoRA fine-tuning and overlap-and-blend co-denoising for high-quality, long-range video editing without seams or drift.
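As a rough illustration of the LoRA half of this contribution, the sketch below applies the standard low-rank update W' = W + (alpha / r) · B · A to a frozen weight matrix. It is a minimal sketch in pure Python, not the authors' training code: the function names, the tiny matrix sizes, and the `alpha` value are assumptions chosen for readability, and the page does not say which layers of the video model are adapted.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def lora_effective_weight(w, a, b, alpha):
    """Return W + (alpha / r) * B @ A, the weight a LoRA-adapted layer uses.

    w: frozen pre-trained weight, shape (out, in)
    a: trainable down-projection, shape (r, in)
    b: trainable up-projection, shape (out, r)
    alpha: scaling hyperparameter (hypothetical value in the usage below)
    """
    rank = len(a)
    scale = alpha / rank
    delta = matmul(b, a)  # low-rank update, shape (out, in)
    return [
        [w_ij + scale * d_ij for w_ij, d_ij in zip(w_row, d_row)]
        for w_row, d_row in zip(w, delta)
    ]


# Illustrative rank-1 adapter on a 2x2 identity weight.
frozen = [[1.0, 0.0], [0.0, 1.0]]
down = [[1.0, 2.0]]    # A, shape (1, 2)
up = [[1.0], [2.0]]    # B, shape (2, 1)
adapted = lora_effective_weight(frozen, down, up, alpha=2.0)
```

Only `a` and `b` would be trained, which is why the approach is cheap relative to updating the full weight matrix. With `b` initialized to zeros, the adapted weight equals the frozen weight before any training step.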

Findings

01 Outperforms baseline methods in PSNR/SSIM metrics

02 Enables editing over hundreds of frames

03 Maintains high perceptual realism (LPIPS)

Abstract

Generating long videos remains a fundamental challenge, and achieving high controllability in video inpainting and outpainting is particularly demanding. To address both of these challenges simultaneously and achieve controllable video inpainting and outpainting for long video clips, we introduce a novel and unified approach for long video inpainting and outpainting that extends text-to-video diffusion models to generate arbitrarily long, spatially edited videos with high fidelity. Our method leverages LoRA to efficiently fine-tune a large pre-trained video diffusion model like Alibaba's Wan 2.1 for masked region video synthesis, and employs an overlap-and-blend temporal co-denoising strategy with high-order solvers to maintain consistency across long sequences. In contrast to prior work that struggles with fixed-length clips or exhibits stitching artifacts, our system enables…
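The overlap-and-blend idea mentioned in the abstract can be sketched as follows: split a long sequence into overlapping temporal windows, process each window independently, then merge the results with weights that ramp down toward window edges. This is a simplified sketch under stated assumptions, not the paper's algorithm. Each frame is reduced to a single float, the linear ramp is one possible weighting, the helper names (`window_starts`, `ramp_weights`, `blend_windows`) are hypothetical, and the high-order solver is not modeled. In a co-denoising setup, a merge like this would presumably run on latents at every denoising step.

```python
def window_starts(num_frames, window, overlap):
    """Start indices of overlapping windows that cover frames [0, num_frames)."""
    if not 0 <= overlap < window:
        raise ValueError("need 0 <= overlap < window")
    if num_frames <= window:
        return [0]
    stride = window - overlap
    starts = list(range(0, num_frames - window + 1, stride))
    if starts[-1] + window < num_frames:
        starts.append(num_frames - window)  # final window sits flush with the end
    return starts


def ramp_weights(window, overlap):
    """Per-frame weights: linear ramps over the overlap, 1.0 in the middle."""
    weights = [1.0] * window
    for i in range(overlap):
        w = (i + 1) / (overlap + 1)
        weights[i] = min(weights[i], w)
        weights[window - 1 - i] = min(weights[window - 1 - i], w)
    return weights


def blend_windows(window_outputs, starts, num_frames, window, overlap):
    """Weighted average of per-window outputs into one full-length sequence."""
    weights = ramp_weights(window, overlap)
    total = [0.0] * num_frames
    norm = [0.0] * num_frames
    for start, frames in zip(starts, window_outputs):
        for i, value in enumerate(frames):
            total[start + i] += weights[i] * value
            norm[start + i] += weights[i]
    return [t / n for t, n in zip(total, norm)]


# Illustrative run: 10 frames, windows of 4 frames, 2 frames of overlap.
starts = window_starts(10, 4, 2)
# Stand-in for per-window denoising: every window returns the constant 1.0.
outputs = [[1.0] * 4 for _ in starts]
blended = blend_windows(outputs, starts, 10, 4, 2)
```

Normalizing by the summed weights keeps the blend a true average, so frames where windows agree are left unchanged and only disagreements in the overlap are smoothed.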

Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis · 3D Shape Modeling and Analysis · Image Enhancement Techniques