AutoTraces: Autoregressive Trajectory Forecasting via Multimodal Large Language Models

Teng Wang; Yanting Lu; Ruize Wang

arXiv:2603.07989·cs.CV·March 10, 2026

Open Access

TL;DR

AutoTraces leverages multimodal large language models with a novel trajectory tokenization scheme and chain-of-thought reasoning to improve long-term robot trajectory forecasting in human environments.

Contribution

The paper introduces a new trajectory tokenization method and an automated chain-of-thought reasoning mechanism, extending LLMs to physical coordinate spaces for trajectory prediction.

Findings

01 Achieves state-of-the-art accuracy in long-horizon forecasting

02 Demonstrates strong cross-scene generalization

03 Supports flexible-length trajectory predictions

Abstract

We present AutoTraces, an autoregressive vision-language-trajectory model for robot trajectory forecasting in human-populated environments, which harnesses the inherent reasoning capabilities of large language models (LLMs) to model complex human behaviors. In contrast to prior works that rely solely on textual representations, our key innovation lies in a novel trajectory tokenization scheme, which represents waypoints with point tokens as categorical and positional markers while encoding waypoint numerical values as corresponding point embeddings, seamlessly integrated into the LLM's space through a lightweight encoder-decoder architecture. This design preserves the LLM's native autoregressive generation mechanism while extending it to physical coordinate spaces, facilitating the modeling of long-term interactions in trajectory data. We further introduce an automated chain-of-thought (CoT)…
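The tokenization idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the marker name `<pt>`, the embedding size, the random linear encoder, and the least-squares decoder are all assumptions made for illustration, standing in for the learned lightweight encoder-decoder whose details are not given here. The sketch only shows the general pattern: each waypoint occupies one sequence position marked by a point token, and its numeric (x, y) value travels as an embedding that a decoder can map back to coordinates.

```python
import random

POINT_TOKEN = "<pt>"  # hypothetical marker name, not taken from the paper
EMBED_DIM = 8         # illustrative embedding size only


def make_encoder(dim, seed=0):
    """Random 2 -> dim linear map standing in for a learned point encoder."""
    rng = random.Random(seed)
    return [[rng.uniform(-1.0, 1.0) for _ in range(2)] for _ in range(dim)]


def encode_point(weights, point):
    """Project an (x, y) waypoint to a dim-sized point embedding."""
    x, y = point
    return [row[0] * x + row[1] * y for row in weights]


def decode_point(weights, embedding):
    """Recover (x, y) by least squares: solve (W^T W) p = W^T e."""
    a = sum(r[0] * r[0] for r in weights)
    b = sum(r[0] * r[1] for r in weights)
    d = sum(r[1] * r[1] for r in weights)
    u = sum(r[0] * e for r, e in zip(weights, embedding))
    v = sum(r[1] * e for r, e in zip(weights, embedding))
    det = a * d - b * b
    return ((d * u - b * v) / det, (a * v - b * u) / det)


def build_sequence(prompt_tokens, waypoints, weights):
    """Text tokens carry no embedding here; each point token carries one."""
    seq = [(tok, None) for tok in prompt_tokens]
    seq += [(POINT_TOKEN, encode_point(weights, p)) for p in waypoints]
    return seq


def read_waypoints(sequence, weights):
    """Decode coordinates from every point-token position in the sequence."""
    return [decode_point(weights, emb)
            for tok, emb in sequence if tok == POINT_TOKEN]


weights = make_encoder(EMBED_DIM)
history = [(0.0, 0.0), (0.5, 0.1), (1.0, 0.3)]
sequence = build_sequence(["Past", "trajectory", ":"], history, weights)
recovered = read_waypoints(sequence, weights)
```

In this toy setup the round trip is exact up to floating-point error because the decoder is the left inverse of the linear encoder. In a trained model the decoder would instead read coordinates from the LLM's output hidden state at each generated point-token position, which is what would let generation continue autoregressively for as many waypoints as needed.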

Taxonomy

Topics: Autonomous Vehicle Technology and Safety · Multimodal Machine Learning Applications · Time Series Analysis and Forecasting