Tokenize the World into Object-level Knowledge to Address Long-tail Events in Autonomous Driving

Ran Tian, Boyi Li, Xinshuo Weng, Yuxiao Chen, Edward Schmerling, Yue Wang, Boris Ivanovic, and Marco Pavone

arXiv:2407.00959 · cs.AI · July 2, 2024 · 2 cites

Open Access

TL;DR

This paper introduces TOKEN, a multi-modal large language model that tokenizes the driving scene into object-level knowledge to improve autonomous vehicle planning in rare, long-tail scenarios, reducing trajectory L2 error by 27% and collision rates by 39%.

Contribution

We propose TOKEN, a novel MM-LLM that enhances autonomous vehicle planning by leveraging object-level scene representations and reasoning alignment to address long-tail event challenges.

Findings

01. 27% reduction in trajectory L2 error
02. 39% decrease in collision rates
03. Outperforms existing frameworks in long-tail scenarios
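
To make the two headline numbers concrete, the following is a minimal sketch of how a trajectory L2 error and a relative reduction are commonly computed. The function names, the toy waypoints, and the choice to average over all waypoints are assumptions for illustration; this page does not state the paper's exact evaluation protocol (horizons, sampling rate, or averaging scheme).

```python
import math

def mean_l2_error(pred, gt):
    """Average Euclidean distance between matched (x, y) waypoints.

    Assumption: both trajectories are sampled at the same timestamps.
    """
    if len(pred) != len(gt) or not pred:
        raise ValueError("trajectories must be non-empty and equally long")
    return sum(math.dist(p, g) for p, g in zip(pred, gt)) / len(pred)

def relative_reduction(baseline, improved):
    """Fractional improvement of `improved` over `baseline` (0.27 means 27%)."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (baseline - improved) / baseline

# Toy data (hypothetical): ground truth drives straight along the x-axis.
gt = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
baseline_pred = [(0.0, 1.0), (1.0, 1.0), (2.0, 1.0)]  # 1.0 m lateral offset
improved_pred = [(0.0, 0.5), (1.0, 0.5), (2.0, 0.5)]  # 0.5 m lateral offset

baseline_err = mean_l2_error(baseline_pred, gt)
improved_err = mean_l2_error(improved_pred, gt)
reduction = relative_reduction(baseline_err, improved_err)
print(f"L2 error {baseline_err:.2f} -> {improved_err:.2f}, reduction {reduction:.0%}")
```

The same `relative_reduction` helper applies to collision rates: a reported 39% decrease means the improved rate is 0.61 times the baseline rate.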

Abstract

The autonomous driving industry is increasingly adopting end-to-end learning from sensory inputs to minimize human biases in system design. Traditional end-to-end driving models, however, suffer from long-tail events due to rare or unseen inputs within their training distributions. To address this, we propose TOKEN, a novel Multi-Modal Large Language Model (MM-LLM) that tokenizes the world into object-level knowledge, enabling better utilization of LLM's reasoning capabilities to enhance autonomous vehicle planning in long-tail scenarios. TOKEN effectively alleviates data scarcity and inefficient tokenization by leveraging a traditional end-to-end driving model to produce condensed and semantically enriched representations of the scene, which are optimized for LLM planning compatibility through deliberate representation and reasoning alignment training stages. Our results demonstrate…
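
The idea of feeding an LLM a short sequence of object-level tokens rather than raw sensor features can be sketched roughly as follows. This is an illustrative toy under stated assumptions, not the authors' implementation: `SceneObject`, `project`, `build_planner_input`, the fixed linear projection, and all dimensions are hypothetical, and a real system would learn the projection during the alignment stages the abstract mentions.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class SceneObject:
    """Hypothetical per-object output of an upstream driving model."""
    label: str
    feature: List[float]

def project(feature: List[float], weight: List[List[float]]) -> List[float]:
    """Linear map into the LLM embedding space; weight is (out_dim x in_dim)."""
    return [sum(w * x for w, x in zip(row, feature)) for row in weight]

def build_planner_input(objects, weight, text_embeddings):
    """One token per object, followed by the embedded text prompt."""
    object_tokens = [project(obj.feature, weight) for obj in objects]
    return object_tokens + text_embeddings

# Toy scene: two objects with 2-d features, projected to a 3-d embedding space.
objects = [
    SceneObject("car", [1.0, 0.0]),
    SceneObject("pedestrian", [0.0, 2.0]),
]
weight = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # stand-in for a learned adapter
text_embeddings = [[0.0, 0.0, 0.0]]            # stand-in for an embedded prompt

sequence = build_planner_input(objects, weight, text_embeddings)
print(len(sequence), "tokens:", sequence)
```

The point of the sketch is the sequence length: the scene contributes one token per object, so the LLM input grows with the number of relevant objects rather than with image resolution.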

Taxonomy

Topics: Autonomous Vehicle Technology and Safety · Adversarial Robustness in Machine Learning · Explainable Artificial Intelligence (XAI)