Orion-Lite: Distilling LLM Reasoning into Efficient Vision-Only Driving Models
Jing Gu, Niccolò Cavagnero, Gijs Dubbelman

TL;DR
This paper introduces Orion-Lite, a compact vision-only model for autonomous driving that distills knowledge from large language models to improve reasoning and performance in complex, interactive scenarios.
Contribution
The work demonstrates that LLM knowledge can be distilled effectively into a vision-only model, which surpasses larger models and sets a new state of the art in closed-loop autonomous driving.
Findings
Orion-Lite outperforms its teacher model in complex scenarios.
It achieves a new state-of-the-art score of 80.6 on Bench2Drive.
Vision-only models have significant untapped potential for reactive planning.
Abstract
Leveraging the general world knowledge of Large Language Models (LLMs) holds significant promise for improving the ability of autonomous driving systems to handle rare and complex scenarios. While integrating LLMs into Vision-Language-Action (VLA) models has yielded state-of-the-art performance, their massive parameter counts pose severe challenges for latency-sensitive and energy-efficient deployment. Distilling LLM knowledge into a compact driving model offers a compelling solution to retain these reasoning capabilities while maintaining a manageable computational footprint. Although previous works have demonstrated the efficacy of distillation, these efforts have primarily focused on relatively simple scenarios and open-loop evaluations. Therefore, in this work, we investigate LLM distillation in more complex, interactive scenarios under closed-loop evaluation. We demonstrate that…
