M3DDM+: An improved video outpainting by a modified masking strategy

Takuya Murakawa; Takumi Fukuzawa; Ning Ding; Toru Tamaki

arXiv:2601.11048·cs.CV·January 19, 2026

Open Access

TL;DR

M3DDM+ enhances video outpainting quality and temporal consistency by aligning training masking strategies with inference requirements, especially in challenging scenarios with limited motion or large outpainting regions.

Contribution

The paper introduces a modified masking strategy, applied while fine-tuning the pretrained M3DDM model, that improves the quality and temporal coherence of video outpainting with latent diffusion models.

Findings

01. Significant improvement in visual fidelity and temporal coherence.

02. Maintains computational efficiency.

03. Effective in scenarios with limited inter-frame information.

Abstract

M3DDM provides a computationally efficient framework for video outpainting via latent diffusion modeling. However, it exhibits significant quality degradation -- manifested as spatial blur and temporal inconsistency -- under challenging scenarios characterized by limited camera motion or large outpainting regions, where inter-frame information is limited. We identify the cause as a training-inference mismatch in the masking strategy: M3DDM's training applies random mask directions and widths across frames, whereas inference requires consistent directional outpainting throughout the video. To address this, we propose M3DDM+, which applies uniform mask direction and width across all frames during training, followed by fine-tuning of the pretrained M3DDM model. Experiments demonstrate that M3DDM+ substantially improves visual fidelity and temporal coherence in information-limited scenarios…
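
The difference between the two masking strategies described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the direction set, the ratio range, and every function name below (`side_mask`, `sample_mask`, `per_frame_random_masks`, `uniform_masks`) are assumptions made for illustration, and masks are plain nested lists rather than tensors.

```python
import random

# Hypothetical direction set; the paper's actual mask types may differ.
DIRECTIONS = ("left", "right", "top", "bottom")


def side_mask(height, width, direction, ratio):
    """Build a height x width binary mask; 1 marks pixels to be outpainted."""
    mask = [[0] * width for _ in range(height)]
    if direction in ("left", "right"):
        n = round(width * ratio)  # number of masked columns
        cols = range(n) if direction == "left" else range(width - n, width)
        for row in mask:
            for c in cols:
                row[c] = 1
    else:
        n = round(height * ratio)  # number of masked rows
        rows = range(n) if direction == "top" else range(height - n, height)
        for r in rows:
            mask[r] = [1] * width
    return mask


def sample_mask(height, width, rng, min_ratio=0.1, max_ratio=0.5):
    """Draw one direction and one width ratio (the range is an assumed placeholder)."""
    direction = rng.choice(DIRECTIONS)
    ratio = rng.uniform(min_ratio, max_ratio)
    return side_mask(height, width, direction, ratio)


def per_frame_random_masks(num_frames, height, width, rng):
    """Per-frame sampling: every frame gets its own direction and width."""
    return [sample_mask(height, width, rng) for _ in range(num_frames)]


def uniform_masks(num_frames, height, width, rng):
    """Uniform sampling: one direction and width shared by all frames of a clip."""
    mask = sample_mask(height, width, rng)
    return [[row[:] for row in mask] for _ in range(num_frames)]


rng = random.Random(0)
clip_masks = uniform_masks(num_frames=4, height=8, width=8, rng=rng)
# Every frame of the clip shares the same mask, as at inference time.
print(all(frame == clip_masks[0] for frame in clip_masks))
```

With per-frame sampling, a region hidden in one frame may be visible in a neighbouring frame, so the model can copy it across frames during training. With the uniform variant the same region is hidden in every frame, which matches the inference setting the abstract describes.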

Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis · Image and Video Quality Assessment · Image Enhancement Techniques