Testing Large Language Models on Driving Theory Knowledge and Skills for Connected Autonomous Vehicles
Zuoyin Tang, Jianhua He, Dashuai Pei, Kezhong Liu, Tao Gao

TL;DR
This study evaluates several large language models on their understanding of driving theory and skills, highlighting their potential and limitations for supporting connected autonomous vehicles in terms of accuracy and cost.
Contribution
It introduces a systematic assessment of multiple LLMs on driving theory tests, providing insights into their suitability and trade-offs for autonomous vehicle applications.
Findings
GPT-4 passes the driving theory test with high accuracy.
Ernie achieves 85% accuracy, just below the passing threshold.
GPT-4's multimodal version achieves 96% accuracy on image-based questions.
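The pass/fail findings above come down to comparing a model's accuracy on multiple-choice questions with a pass mark. A minimal sketch of that arithmetic follows. The `Question` type, the `accuracy` and `passes` helpers, and the default threshold of 0.86 are illustrative assumptions, not the authors' evaluation code. The summary above only says that 85% falls just below the threshold.

```python
from dataclasses import dataclass


@dataclass
class Question:
    """Hypothetical container for one multiple-choice theory question."""
    prompt: str
    options: dict[str, str]  # option label -> option text
    answer: str              # label of the correct option


def accuracy(questions: list[Question], predictions: list[str]) -> float:
    """Return the fraction of questions whose predicted label matches the key."""
    if not questions:
        raise ValueError("at least one question is required")
    if len(questions) != len(predictions):
        raise ValueError("one prediction per question is required")
    correct = sum(
        q.answer.strip().upper() == p.strip().upper()
        for q, p in zip(questions, predictions)
    )
    return correct / len(questions)


def passes(acc: float, threshold: float = 0.86) -> bool:
    """Pass if accuracy reaches the threshold (0.86 is an assumed pass mark)."""
    return acc >= threshold


if __name__ == "__main__":
    # Toy data: 20 questions whose correct answer is always "A".
    qs = [Question(f"Q{i}", {"A": "yes", "B": "no"}, "A") for i in range(20)]
    preds = ["A"] * 17 + ["B"] * 3  # 17 of 20 correct
    acc = accuracy(qs, preds)
    print(f"accuracy={acc:.2f}, pass={passes(acc)}")
```

With 17 of 20 answers correct the accuracy is 0.85, which fails under the assumed 0.86 pass mark, while an accuracy of 0.96 would pass.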
Abstract
Handling long-tail corner cases is a major challenge faced by autonomous vehicles (AVs). While large language models (LLMs) hold great potential for handling corner cases, thanks to their excellent generalization and explanation capabilities, and have received increasing research interest for autonomous driving applications, technical barriers remain, such as the strict model performance requirements and the huge computing resource requirements of LLMs. In this paper, we investigate a new approach of applying remote or edge LLMs to support autonomous driving. A key issue for such an LLM-assisted driving system is assessing the LLMs' understanding of driving theory and skills, to ensure they are qualified to undertake safety-critical driving assistance tasks for connected autonomous vehicles (CAVs). We design and run driving theory tests for several proprietary LLM models (OpenAI GPT models, Baidu Ernie and Ali QWen) and…
Taxonomy
Methods: Cosine Annealing · Label Smoothing · Discriminative Fine-Tuning · Position-Wise Feed-Forward Layer · Absolute Position Encodings · Linear Warmup With Cosine Annealing · Residual Connection · Dropout · Transformer
