GeoDrive: 3D Geometry-Informed Driving World Model with Precise Action Control
Anthony Chen, Wenzhao Zheng, Yida Wang, Xueyang Zhang, Kun Zhan, Peng Jia, Kurt Keutzer, Shanghang Zhang

TL;DR
GeoDrive introduces a 3D geometry-informed world model for autonomous driving that improves spatial understanding, action control, and scene editing, addressing previous deficiencies in geometric consistency and occlusion handling.
Contribution
The paper presents GeoDrive, a novel driving world model that explicitly incorporates 3D geometry and dynamic editing to enhance safety, accuracy, and scene manipulation in autonomous driving.
Findings
Outperforms existing models in action accuracy and 3D spatial awareness
Enables realistic and adaptable scene modeling for safer driving
Supports interactive scene editing and trajectory control
Abstract
Recent advancements in world models have revolutionized dynamic environment simulation, allowing systems to foresee future states and assess potential actions. In autonomous driving, these capabilities help vehicles anticipate the behavior of other road users, perform risk-aware planning, accelerate training in simulation, and adapt to novel scenarios, thereby enhancing safety and reliability. Current approaches either fail to maintain robust 3D geometric consistency or accumulate artifacts during occlusion handling; both properties are critical for reliable safety assessment in autonomous navigation tasks. To address this, we introduce GeoDrive, which explicitly integrates robust 3D geometry conditions into driving world models to enhance spatial understanding and action controllability. Specifically, we first extract a 3D representation from the input frame and then obtain its 2D…
Taxonomy
Topics: Human Motion and Animation · 3D Shape Modeling and Analysis · Computer Graphics and Visualization Techniques
