Orbis: Overcoming Challenges of Long-Horizon Prediction in Driving World Models
Arian Mousakhan, Sudhanshu Mittal, Silvio Galesso, Karim Farid, Thomas Brox

TL;DR
This paper introduces Orbis, a world model for autonomous driving that achieves state-of-the-art long-horizon prediction with simple design choices and no extra supervision or sensors. It also compares discrete token models with continuous models based on flow matching.
Contribution
Orbis demonstrates that a simple model with only 469M parameters, trained on 280h of video data, can reach state-of-the-art long-horizon prediction in driving scenarios without additional supervision.
Findings
The continuous autoregressive model is less brittle on individual design choices and more powerful than the discrete token model.
The proposed model stands out in difficult scenarios such as turning maneuvers and urban traffic.
Simple design choices can lead to state-of-the-art performance in autonomous driving world models.
Abstract
Existing world models for autonomous driving struggle with long-horizon generation and generalization to challenging scenarios. In this work, we develop a model using simple design choices, and without additional supervision or sensors, such as maps, depth, or multiple cameras. We show that our model yields state-of-the-art performance, despite having only 469M parameters and being trained on 280h of video data. It particularly stands out in difficult scenarios like turning maneuvers and urban traffic. We test whether discrete token models possibly have advantages over continuous models based on flow matching. To this end, we set up a hybrid tokenizer that is compatible with both approaches and allows for a side-by-side comparison. Our study concludes in favor of the continuous autoregressive model, which is less brittle on individual design choices and more powerful than the model…
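The abstract refers to continuous models based on flow matching without spelling out the objective. As a rough illustration only, here is the generic flow-matching training target with a linear noise-to-data path. It is not taken from the Orbis code, and the names `flow_matching_pair`, `flow_matching_loss`, and `predict_velocity` are hypothetical.

```python
import numpy as np

def flow_matching_pair(x0, x1, t):
    """Linear path between noise x0 and data x1; returns (x_t, target velocity)."""
    # Reshape t from (batch,) to (batch, 1, ...) so it broadcasts over features.
    t = t.reshape(-1, *([1] * (x0.ndim - 1)))
    x_t = (1.0 - t) * x0 + t * x1   # point on the straight line at time t
    v_target = x1 - x0              # constant velocity along that line
    return x_t, v_target

def flow_matching_loss(predict_velocity, x0, x1, t):
    """Mean squared error between predicted and target velocity."""
    x_t, v_target = flow_matching_pair(x0, x1, t)
    v_pred = predict_velocity(x_t, t)
    return float(np.mean((v_pred - v_target) ** 2))

rng = np.random.default_rng(0)
x1 = rng.normal(size=(4, 8))  # stand-in for continuous latent tokens (assumed shape)
x0 = rng.normal(size=(4, 8))  # Gaussian noise samples
t = rng.uniform(size=4)       # one time value per batch element

# Sanity check: a predictor that returns the true velocity has zero loss.
oracle = lambda x_t, t: x1 - x0
zero_loss = flow_matching_loss(oracle, x0, x1, t)
```

In a real model, `predict_velocity` would be a learned network conditioned on past frames. A discrete token model would instead predict categorical distributions over codebook indices, which is the alternative the paper compares against.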
Taxonomy
Topics: Traffic Prediction and Management Techniques
