Driving in the Occupancy World: Vision-Centric 4D Occupancy Forecasting and Planning via World Models for Autonomous Driving

Yu Yang; Jianbiao Mei; Yukai Ma; Siliang Du; Wenqing Chen; Yijie Qian; Yuxiang Feng; Yong Liu

arXiv:2408.14197·cs.CV·January 20, 2025

TL;DR

This paper introduces Drive-OccWorld, a vision-centric 4D occupancy forecasting model that integrates semantic and motion information for end-to-end autonomous driving planning, enabling controllable and plausible future state generation.

Contribution

The paper proposes a 4D occupancy world model that combines semantic- and motion-conditional normalization with flexible action conditioning for end-to-end autonomous driving planning.

Findings

1. Accurately forecasts future occupancy and flow in 4D space.
2. Enables controllable generation with various action inputs.
3. Demonstrates superior performance on nuScenes and Lyft datasets.

Abstract

World models envision potential future states based on various ego actions. They embed extensive knowledge about the driving environment, facilitating safe and scalable autonomous driving. Most existing methods primarily focus on either data generation or the pretraining paradigms of world models. Unlike the aforementioned prior works, we propose Drive-OccWorld, which adapts a vision-centric 4D forecasting world model to end-to-end planning for autonomous driving. Specifically, we first introduce a semantic and motion-conditional normalization in the memory module, which accumulates semantic and dynamic information from historical BEV embeddings. These BEV features are then conveyed to the world decoder for future occupancy and flow forecasting, considering both geometry and spatiotemporal modeling. Additionally, we propose injecting flexible action conditions, such as velocity,…
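The semantic- and motion-conditional normalization mentioned in the abstract can be pictured with a minimal sketch. It assumes the common conditional-normalization pattern: a BEV feature vector is normalized without learned affine parameters, then rescaled and shifted by values predicted from a conditioning vector (here, hypothetical semantic class probabilities for one BEV cell). The function names, weight shapes, and the `(1 + gamma)` form are illustrative assumptions, not the authors' implementation.

```python
import math

def layer_norm(x, eps=1e-5):
    """Normalize one feature vector to zero mean and unit variance."""
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    return [(v - mean) / math.sqrt(var + eps) for v in x]

def linear(cond, weights):
    """Project the conditioning vector; one weight row per output channel."""
    return [sum(w * c for w, c in zip(row, cond)) for row in weights]

def conditional_norm(feature, cond, w_gamma, w_beta):
    """Normalize `feature`, then modulate it with scale/shift derived from `cond`."""
    normed = layer_norm(feature)
    gamma = linear(cond, w_gamma)  # per-channel scale offset (assumed form)
    beta = linear(cond, w_beta)    # per-channel shift (assumed form)
    return [(1.0 + g) * n + b for g, n, b in zip(gamma, normed, beta)]

# Hypothetical inputs: a 4-channel BEV feature and 3 semantic class probabilities.
feature = [0.5, -1.0, 2.0, 0.0]
semantic_probs = [0.7, 0.2, 0.1]

zeros = [[0.0] * 3 for _ in range(4)]
ones = [[1.0] * 3 for _ in range(4)]

# With zero projection weights the layer reduces to plain normalization.
baseline = conditional_norm(feature, semantic_probs, zeros, zeros)
# With an all-ones shift projection every channel moves by sum(semantic_probs).
shifted = conditional_norm(feature, semantic_probs, zeros, ones)
```

In this toy setup the conditioning only changes the scale and shift applied after normalization, which is what lets semantic or motion cues steer the features without altering the normalization statistics themselves.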

Taxonomy

Topics: Data Management and Algorithms · Transportation and Mobility Innovations