Vid2Sim: Realistic and Interactive Simulation from Video for Urban Navigation

Ziyang Xie; Zhizheng Liu; Zhenghao Peng; Wayne Wu; Bolei Zhou

arXiv:2501.06693 · cs.CV · January 15, 2025

TL;DR

Vid2Sim introduces a scalable framework that converts monocular videos into photorealistic 3D urban environments, significantly enhancing the training and deployment of navigation agents in real-world scenarios.

Contribution

This work presents a novel real2sim pipeline that bridges the sim2real gap by generating realistic 3D environments from videos for reinforcement learning.

Findings

1. Improves urban navigation success rate by 31.2% in digital twins.
2. Enhances real-world navigation success rate by 68.3%.
3. Demonstrates effective bridging of the sim2real gap with video-based environment generation.

Abstract

Sim-to-real gap has long posed a significant challenge for robot learning in simulation, preventing the deployment of learned models in the real world. Previous work has primarily focused on domain randomization and system identification to mitigate this gap. However, these methods are often limited by the inherent constraints of the simulation and graphics engines. In this work, we propose Vid2Sim, a novel framework that effectively bridges the sim2real gap through a scalable and cost-efficient real2sim pipeline for neural 3D scene reconstruction and simulation. Given a monocular video as input, Vid2Sim can generate photorealistic and physically interactable 3D simulation environments to enable the reinforcement learning of visual navigation agents in complex urban environments. Extensive experiments demonstrate that Vid2Sim significantly improves the performance of urban navigation in…

Taxonomy

Topics: Geographic Information Systems Studies · 3D Modeling in Geospatial Applications · Evacuation and Crowd Dynamics