VisionNVS: Self-Supervised Inpainting for Novel View Synthesis under the Virtual-Shift Paradigm

Hongbo Lu; Liang Yao; Chenghao He; Fan Liu; Wenlong Liao; Tao He; Pai Peng

arXiv:2603.17382·cs.CV·March 19, 2026

TL;DR

VisionNVS introduces a self-supervised inpainting approach for novel view synthesis in autonomous driving, leveraging a Virtual-Shift paradigm and monocular depth proxies to improve geometric fidelity and visual quality without requiring ground truth for unseen views.

Contribution

It reformulates view synthesis as a self-supervised inpainting task using Virtual-Shift, eliminating the domain gap and enabling training solely on raw images.

Findings

01. Outperforms LiDAR-dependent baselines in geometric fidelity.

02. Achieves superior visual quality in novel view synthesis.

03. Provides a scalable solution for driving simulation.

Abstract

A fundamental bottleneck in Novel View Synthesis (NVS) for autonomous driving is the inherent supervision gap on novel trajectories: models are tasked with synthesizing unseen views during inference, yet lack ground truth images for these shifted poses during training. In this paper, we propose VisionNVS, a camera-only framework that fundamentally reformulates view synthesis from an ill-posed extrapolation problem into a self-supervised inpainting task. By introducing a "Virtual-Shift" strategy, we use monocular depth proxies to simulate occlusion patterns and map them onto the original view. This paradigm shift allows the use of raw, recorded images as pixel-perfect supervision, effectively eliminating the domain gap inherent in previous approaches. Furthermore, we address spatial consistency through a Pseudo-3D Seam Synthesis strategy, which integrates visual data from adjacent…
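
As a rough illustration of the Virtual-Shift idea described in the abstract, the sketch below forward-warps the pixels of a recorded view to a sideways-shifted virtual camera using per-pixel depth, marks the pixels of the virtual view that receive no source pixel as holes, and blanks that hole pattern out of the original image so that the untouched recording can serve as the inpainting target. This is not the authors' implementation: the pure lateral translation, the pinhole disparity formula, the way the hole pattern is applied to the original image, and every name used (`virtual_shift_hole_mask`, `make_training_pair`, `focal_px`, `shift_m`) are assumptions chosen for illustration.

```python
# Hypothetical sketch of a Virtual-Shift-style training pair.
# All function and parameter names are illustrative assumptions,
# not taken from the VisionNVS paper or code.


def virtual_shift_hole_mask(depth, focal_px, shift_m):
    """Return a 2D list of booleans that is True wherever a camera shifted
    sideways by shift_m metres would receive no pixel from the original view.

    depth    -- 2D list of per-pixel depths in metres (e.g. a monocular estimate)
    focal_px -- focal length in pixels (pinhole camera assumed)
    shift_m  -- lateral camera translation in metres (positive = to the right)
    """
    height, width = len(depth), len(depth[0])
    covered = [[False] * width for _ in range(height)]
    for v in range(height):
        for u in range(width):
            z = depth[v][u]
            if z <= 0:
                continue  # skip invalid depth values
            # For a pure sideways translation, a point at depth z moves by
            # focal_px * shift_m / z pixels in the opposite direction.
            disparity = focal_px * shift_m / z
            u_new = int(round(u - disparity))
            if 0 <= u_new < width:
                covered[v][u_new] = True
    # Pixels that nothing landed on are the simulated disocclusion holes.
    return [[not c for c in row] for row in covered]


def make_training_pair(image, depth, focal_px, shift_m, fill=0):
    """Blank the simulated hole pattern out of the original image.

    Returns (masked_input, mask, target), where target is the unmodified
    recorded image, so supervision needs no ground truth for the shifted pose.
    """
    mask = virtual_shift_hole_mask(depth, focal_px, shift_m)
    masked = [
        [fill if mask[v][u] else image[v][u] for u in range(len(image[0]))]
        for v in range(len(image))
    ]
    return masked, mask, image


# Toy example: one image row with a near object (2 m) in front of a far wall (10 m).
image = [[1, 2, 3, 4, 5, 6]]
depth = [[10, 10, 2, 2, 10, 10]]
masked, mask, target = make_training_pair(image, depth, focal_px=4, shift_m=1)
```

In this toy row the near object moves two pixels to the left under the shift while the far wall barely moves, so the hole opens up where the object used to be. A real pipeline would presumably operate on dense depth maps and full camera intrinsics; the nested loops here are kept only for readability.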

Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis · Advanced Vision and Imaging · 3D Shape Modeling and Analysis