Foundation Models for Trajectory Planning in Autonomous Driving: A Review of Progress and Open Challenges
Kemal Oksuz, Alexandru Buburuzan, Anthony Knittel, Yuhan Yao, Puneet K. Dokania

TL;DR
This paper reviews recent advances in foundation models for trajectory planning in autonomous driving, covering their architectures, capabilities, and limitations, along with open challenges in the field.
Contribution
It provides a comprehensive taxonomy and critical evaluation of 37 recent foundation-model-based trajectory planning approaches in autonomous driving.
Findings
Foundation models enable direct trajectory inference from raw sensory data.
Multi-modal models incorporating natural language expand application scope.
Many approaches lack open-source code and datasets, limiting reproducibility.
Abstract
The emergence of multi-modal foundation models has markedly transformed the technology for autonomous driving, shifting away from conventional and mostly hand-crafted design choices towards unified, foundation-model-based approaches, capable of directly inferring motion trajectories from raw sensory inputs. This new class of methods can also incorporate natural language as an additional modality, with Vision-Language-Action (VLA) models serving as a representative example. In this review, we provide a comprehensive examination of such methods through a unifying taxonomy to critically evaluate their architectural design choices, methodological strengths, and their inherent capabilities and limitations. Our survey covers 37 recently proposed approaches that span the landscape of trajectory planning with foundation models. Furthermore, we assess these approaches with respect to the…
Taxonomy
Topics: Robotic Path Planning Algorithms · Autonomous Vehicle Technology and Safety · Multimodal Machine Learning Applications
