Frozen LLMs as Map-Aware Spatio-Temporal Reasoners for Vehicle Trajectory Prediction

Yanjiao Liu; Jiawei Liu; Xun Gong; Zifei Nie

arXiv:2604.21479·cs.CV·May 1, 2026


TL;DR

This paper presents a framework that uses frozen large language models (LLMs) as the reasoning engine for vehicle trajectory prediction. Encoded scene features and map semantics are fed to the LLM in order to evaluate how well it understands traffic dynamics.

Contribution

The paper introduces an approach that applies frozen LLMs, with minimal adaptation, to understanding traffic scenes and predicting vehicle trajectories in autonomous driving.

Findings

01. LLMs can effectively incorporate map semantics for trajectory prediction.

02. The framework demonstrates strong generalizability across different LLM architectures.

03. Quantitative analysis shows the impact of multi-modal information on prediction accuracy.

Abstract

Large language models (LLMs) have recently demonstrated strong reasoning capabilities and attracted increasing research attention in the field of autonomous driving (AD). However, safe application of LLMs to AD perception and prediction still requires a thorough understanding of both the dynamic traffic agents and the static road infrastructure. To this end, this study introduces a framework to evaluate the capability of LLMs in understanding the behaviors of dynamic traffic agents and the topology of road networks. The framework leverages frozen LLMs as the reasoning engine, employing a traffic encoder to extract spatial-level scene features from observed trajectories of agents, while a lightweight Convolutional Neural Network (CNN) encodes the local high-definition (HD) maps. To assess the intrinsic reasoning ability of LLMs, the extracted scene features are then transformed into…
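
The data flow described in the abstract (a traffic encoder for observed trajectories, a lightweight CNN for the local map, and a frozen LLM as the reasoning engine) can be illustrated with a toy NumPy sketch. This is not the authors' implementation. The abstract is truncated before it says how the encoded features are handed to the LLM, so feeding them as a concatenated token sequence is an assumption here, as are the output head, all array sizes, and every name in the code. The "LLM" is a single fixed mixing layer that only stands in for a real pretrained model.

```python
import numpy as np

rng = np.random.default_rng(0)

# All sizes are hypothetical, chosen only to make the shapes concrete.
N_AGENTS, T_OBS, T_PRED = 4, 10, 6   # agents, observed steps, predicted steps
MAP_HW, D_MODEL = 16, 32             # map raster side length, backbone width


def init(*shape):
    """Small random weights standing in for learned parameters."""
    return rng.normal(0.0, 0.1, size=shape)


# Trainable adapters (hypothetical names): traffic encoder, map CNN, output head.
trainable = {
    "traffic_enc": init(T_OBS * 2, D_MODEL),
    "map_kernel": init(3, 3),
    "map_proj": init((MAP_HW - 2) ** 2, D_MODEL),
    "head": init(D_MODEL, T_PRED * 2),
}

# Frozen stand-in for the LLM: fixed weights that are never updated.
frozen = {"w_mix": init(D_MODEL, D_MODEL)}
for w in frozen.values():
    w.setflags(write=False)  # in-place updates now raise ValueError


def encode_traffic(traj):
    """(N, T_OBS, 2) observed xy positions -> (N, D_MODEL) scene tokens."""
    flat = traj.reshape(traj.shape[0], -1)
    return np.tanh(flat @ trainable["traffic_enc"])


def encode_map(raster):
    """(H, W) map raster -> (1, D_MODEL) map token via one 3x3 valid convolution."""
    windows = np.lib.stride_tricks.sliding_window_view(raster, (3, 3))
    conv = (windows * trainable["map_kernel"]).sum(axis=(-1, -2))
    feat = np.maximum(conv, 0.0)  # ReLU
    return feat.reshape(1, -1) @ trainable["map_proj"]


def frozen_backbone(tokens):
    """Toy attention-style token mixing with fixed weights."""
    scores = tokens @ frozen["w_mix"] @ tokens.T
    scores = scores - scores.max(axis=1, keepdims=True)  # numerical stability
    attn = np.exp(scores)
    attn = attn / attn.sum(axis=1, keepdims=True)
    return tokens + attn @ tokens  # residual connection


def predict(traj, raster):
    """Encode both inputs, mix them in the frozen backbone, decode agent tokens."""
    # Assumed interface: scene tokens and the map token form one sequence.
    tokens = np.concatenate([encode_traffic(traj), encode_map(raster)], axis=0)
    hidden = frozen_backbone(tokens)
    out = hidden[: traj.shape[0]] @ trainable["head"]
    return out.reshape(traj.shape[0], T_PRED, 2)


traj = rng.normal(size=(N_AGENTS, T_OBS, 2))
raster = rng.random((MAP_HW, MAP_HW))
pred = predict(traj, raster)
print(pred.shape)  # (4, 6, 2)
```

Keeping the backbone weights in a separate, read-only dictionary mirrors the idea of a frozen model: in a training loop only the entries of `trainable` would receive gradient updates, so any gain in accuracy would have to come from the adapters and from whatever the fixed backbone can already do with the tokens it is given.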
