VLP: Vision Language Planning for Autonomous Driving

Chenbin Pan; Burhaneddin Yaman; Tommaso Nesti; Abhirup Mallik; Alessandro G Allievi; Senem Velipasalar; Liu Ren

arXiv:2401.05577 · cs.CV · November 26, 2024 · 1 citation

Open Access

TL;DR

VLP introduces a vision-language planning framework that leverages language models to improve autonomous driving by enhancing scene understanding, reasoning, and generalization, achieving state-of-the-art results on NuScenes.

Contribution

The paper presents VLP, a novel framework integrating language models into autonomous driving to address reasoning and generalization challenges.

Findings

01

Achieves a 35.9% reduction in average L2 error on NuScenes

02

Achieves a 60.5% reduction in collision rate on NuScenes

03

Improves performance in long-tail scenarios
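
As a reading aid for the first two findings, the sketch below shows one common way an average L2 planning error and a relative reduction against a baseline can be computed. It is a minimal illustration, not the paper's evaluation protocol: the function names, the waypoints, and the baseline and improved error values are all hypothetical, and the exact averaging used in NuScenes planning benchmarks may differ.

```python
import math


def average_l2_error(planned, ground_truth):
    """Mean Euclidean distance between matching waypoints of two trajectories."""
    if not planned or len(planned) != len(ground_truth):
        raise ValueError("trajectories must be non-empty and of equal length")
    distances = [math.dist(p, g) for p, g in zip(planned, ground_truth)]
    return sum(distances) / len(distances)


def relative_reduction(baseline, improved):
    """Percentage by which `improved` is lower than `baseline`."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return 100.0 * (baseline - improved) / baseline


# Hypothetical waypoints (x, y), for illustration only.
planned = [(0.0, 0.0), (3.0, 4.0)]
ground_truth = [(0.0, 0.0), (0.0, 0.0)]
print(average_l2_error(planned, ground_truth))  # (0 + 5) / 2 = 2.5

# Hypothetical baseline and improved errors, not values from the paper.
print(relative_reduction(2.0, 1.0))  # 50.0
```

Read this way, a 35.9% reduction means the new average error is roughly 64% of the baseline's, whatever the baseline's absolute value.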

Abstract

Autonomous driving is a complex and challenging task that aims at safe motion planning through scene understanding and reasoning. While vision-only autonomous driving methods have recently achieved notable performance through enhanced scene understanding, several key issues, including lack of reasoning, low generalization performance, and long-tail scenarios, still need to be addressed. In this paper, we present VLP, a novel Vision-Language-Planning framework that exploits language models to bridge the gap between linguistic understanding and autonomous driving. VLP enhances autonomous driving systems by strengthening both the source memory foundation and the self-driving car's contextual understanding. VLP achieves state-of-the-art end-to-end planning performance on the challenging NuScenes dataset by achieving 35.9% and 60.5% reduction in terms of average L2 error and collision…

Taxonomy

Topics: Multimodal Machine Learning Applications · Natural Language Processing Techniques · Topic Modeling