EX-4D: EXtreme Viewpoint 4D Video Synthesis via Depth Watertight Mesh

Tao Hu; Haoyang Peng; Xiao Liu; Yuewen Ma

arXiv:2506.05554·cs.CV·June 9, 2025

TL;DR

EX-4D introduces a novel depth watertight mesh representation and a training strategy to generate high-quality, physically consistent 4D videos from monocular input, especially under extreme viewpoints.

Contribution

The paper presents a new depth watertight mesh framework and a simulated masking training method for 4D video synthesis from monocular videos.

Findings

01 Outperforms state-of-the-art methods in extreme-viewpoint quality

02 Ensures geometric consistency under challenging viewpoints

03 Produces temporally coherent, high-quality videos

Abstract

Generating high-quality camera-controllable videos from monocular input is a challenging task, particularly under extreme viewpoints. Existing methods often struggle with geometric inconsistencies and occlusion artifacts at boundaries, leading to degraded visual quality. In this paper, we introduce EX-4D, a novel framework that addresses these challenges through a Depth Watertight Mesh representation. The representation serves as a robust geometric prior by explicitly modeling both visible and occluded regions, ensuring geometric consistency under extreme camera poses. To overcome the lack of paired multi-view datasets, we propose a simulated masking strategy that generates effective training data from monocular videos alone. Additionally, a lightweight LoRA-based video diffusion adapter is employed to synthesize high-quality, physically consistent, and temporally coherent videos. Extensive…
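
The abstract names a LoRA-based adapter but does not describe its layout, so the following is only a minimal sketch of the generic low-rank adaptation idea, not the authors' adapter. It assumes a single frozen weight matrix W that receives a trainable update (alpha / r) * B @ A. The class name `LoRALinear`, the helper `matmul`, and all parameter names are hypothetical and chosen for illustration.

```python
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


class LoRALinear:
    """Frozen weight W (out x in) plus a low-rank update (alpha / rank) * B @ A.

    Illustrative only: the layer shapes and initialization used in EX-4D
    are not stated in the abstract and are assumed here.
    """

    def __init__(self, weight, rank, alpha=1.0, seed=0):
        rng = random.Random(seed)
        out_dim, in_dim = len(weight), len(weight[0])
        self.weight = weight              # frozen base weights, never updated
        self.scale = alpha / rank         # standard LoRA scaling factor
        # A starts small and random, B starts at zero, so the adapter
        # is a no-op before any training step.
        self.A = [[rng.gauss(0.0, 0.01) for _ in range(in_dim)] for _ in range(rank)]
        self.B = [[0.0] * rank for _ in range(out_dim)]

    def effective_weight(self):
        """Return W + scale * (B @ A) as a new matrix."""
        delta = matmul(self.B, self.A)
        return [
            [w + self.scale * d for w, d in zip(w_row, d_row)]
            for w_row, d_row in zip(self.weight, delta)
        ]

    def forward(self, x):
        """Apply the adapted layer to a single input vector x."""
        return [
            sum(w * xi for w, xi in zip(row, x))
            for row in self.effective_weight()
        ]

    def num_trainable(self):
        """Count adapter parameters (entries of A and B only)."""
        return len(self.A) * len(self.A[0]) + len(self.B) * len(self.B[0])


layer = LoRALinear([[1, 2, 3, 4], [5, 6, 7, 8], [0, 1, 0, 1]], rank=2)
output_at_init = layer.forward([1, 1, 1, 1])
```

Because B is initialized to zero, `output_at_init` equals the output of the frozen layer alone, and only the `rank * (in_dim + out_dim)` adapter entries would be trained. That small parameter count is what makes such an adapter lightweight relative to fine-tuning the full weight matrix.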

Taxonomy

Topics: Advanced Vision and Imaging · Generative Adversarial Networks and Image Synthesis · 3D Shape Modeling and Analysis