DriveCode: Domain Specific Numerical Encoding for LLM-Based Autonomous Driving

Zhiye Wang; Yanbo Jiang; Rui Zhou; Bo Zhang; Fang Zhang; Zhenhua Xu; Yaqin Zhang; Jianqiang Wang

arXiv:2603.00919 · cs.CV · March 24, 2026

Open Access · 1 dataset

TL;DR

DriveCode introduces a new numerical encoding method for LLMs in autonomous driving, enabling precise numerical reasoning and efficient processing of sensor data and control commands.

Contribution

It proposes a novel embedding-based numerical encoding technique that improves numerical reasoning and integration in LLMs for autonomous driving tasks.

Findings

01. Outperforms existing methods in trajectory prediction
02. Enhances control signal accuracy
03. Demonstrates effectiveness across multiple datasets

Abstract

Large language models (LLMs) have shown great promise for autonomous driving. However, discretizing numbers into tokens limits precise numerical reasoning, fails to reflect the positional significance of digits in the training objective, and makes it difficult to achieve both decoding efficiency and numerical precision. These limitations affect both the processing of sensor measurements and the generation of precise control commands, creating a fundamental barrier for deploying LLM-based autonomous driving systems. In this paper, we introduce DriveCode, a novel numerical encoding method that represents numbers as dedicated embeddings rather than discrete text tokens. DriveCode employs a number projector to map numbers into the language model's hidden space, enabling seamless integration with visual and textual features in a unified multimodal sequence. Evaluated on OmniDrive, DriveGPT4,…
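The idea of a number projector can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the projector's architecture, so the two-layer tanh MLP, the hidden size, and the names `make_projector`, `embed_sequence`, and `token_table` are all assumptions chosen for illustration. The sketch only shows the general pattern of giving each number its own embedding, computed from its value, and placing it in the same sequence as ordinary token embeddings.

```python
import math
import random

HIDDEN = 8  # hypothetical hidden size; real language models use far larger values


def make_projector(hidden, width=16, seed=0):
    """Build a tiny two-layer MLP that maps one scalar to a `hidden`-dim vector.

    The architecture is an assumption; weights are random, not trained.
    """
    rng = random.Random(seed)
    w1 = [rng.uniform(-1, 1) for _ in range(width)]
    b1 = [rng.uniform(-1, 1) for _ in range(width)]
    w2 = [[rng.uniform(-1, 1) / width for _ in range(width)]
          for _ in range(hidden)]

    def project(x):
        # First layer: scalar -> `width` tanh units.
        h = [math.tanh(w * x + b) for w, b in zip(w1, b1)]
        # Second layer: linear map into the model's hidden space.
        return [sum(wi * hi for wi, hi in zip(row, h)) for row in w2]

    return project


def embed_sequence(items, token_table, project):
    """Embed a mixed sequence: strings by table lookup, numbers by projection."""
    out = []
    for item in items:
        if isinstance(item, (int, float)):
            out.append(project(float(item)))  # one embedding per number
        else:
            out.append(token_table[item])     # ordinary token embedding
    return out


# Hypothetical toy vocabulary with random embeddings.
rng = random.Random(1)
token_table = {t: [rng.gauss(0, 1) for _ in range(HIDDEN)]
               for t in ["speed", "is", "m/s"]}
project = make_projector(HIDDEN)

seq = embed_sequence(["speed", "is", 12.5, "m/s"], token_table, project)
```

In this sketch the value 12.5 occupies a single position in the sequence regardless of how many digits it has, whereas a text tokenizer would typically split it into several tokens. How DriveCode decodes numbers on the output side is not described in the excerpt above and is not modeled here.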

Code & Models

Datasets

shiftwilliam/DriveCode-data
dataset · 326 dl

Taxonomy

Topics: Autonomous Vehicle Technology and Safety · Multimodal Machine Learning Applications · Advanced Neural Network Applications