DONUT: A Decoder-Only Model for Trajectory Prediction

Markus Knoche; Daan de Geus; Bastian Leibe

arXiv:2506.06854 · cs.CV · August 4, 2025

TL;DR

DONUT is a decoder-only autoregressive model for trajectory prediction in autonomous driving. A single model encodes historical trajectories and unrolls future ones step by step, and an auxiliary 'overprediction' task trains it to also predict trajectories at longer temporal horizons so that it anticipates the future better.

Contribution

The paper presents a decoder-only architecture for trajectory prediction, inspired by decoder-only language models, together with an overprediction strategy, inspired by multi-token prediction, that improves forecasting performance.

Findings

01. Outperforms encoder-decoder baselines on the Argoverse 2 benchmark

02. Achieves state-of-the-art results in single-agent motion forecasting

03. Makes iterative predictions in a more consistent manner

Abstract

Predicting the motion of other agents in a scene is highly relevant for autonomous driving, as it allows a self-driving car to anticipate. Inspired by the success of decoder-only models for language modeling, we propose DONUT, a Decoder-Only Network for Unrolling Trajectories. Unlike existing encoder-decoder forecasting models, we encode historical trajectories and predict future trajectories with a single autoregressive model. This allows the model to make iterative predictions in a consistent manner, and ensures that the model is always provided with up-to-date information, thereby enhancing performance. Furthermore, inspired by multi-token prediction for language modeling, we introduce an 'overprediction' strategy that gives the model the auxiliary task of predicting trajectories at longer temporal horizons. This allows the model to better anticipate the future and further improves…
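The autoregressive unrolling and the overprediction objective described in the abstract can be illustrated with a small, self-contained sketch. This is not the authors' implementation: `toy_step` is a hypothetical constant-velocity stand-in for one call to the decoder, the names `step_len`, `over_len`, and `over_weight` are invented for illustration, the L1 loss is an assumption, and the sketch assumes the overpredicted segment serves only as an auxiliary training target and is discarded when unrolling.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def toy_step(history: List[Point], step_len: int, over_len: int) -> Tuple[List[Point], List[Point]]:
    """Hypothetical stand-in for one decoder call (constant-velocity extrapolation).

    Returns the next `step_len` positions plus an 'overprediction' of the
    `over_len` positions that follow them.
    """
    (x0, y0), (x1, y1) = history[-2], history[-1]
    vx, vy = x1 - x0, y1 - y0
    future = [(x1 + vx * k, y1 + vy * k) for k in range(1, step_len + over_len + 1)]
    return future[:step_len], future[step_len:]


def unroll(history: List[Point], horizon: int, step_len: int, over_len: int) -> List[Point]:
    """Autoregressively unroll a trajectory, feeding each prediction back as input."""
    tokens = list(history)
    predicted: List[Point] = []
    while len(predicted) < horizon:
        next_chunk, _over_chunk = toy_step(tokens, step_len, over_len)
        # Assumption: only the next chunk is kept; the overprediction is discarded here.
        predicted.extend(next_chunk)
        tokens.extend(next_chunk)
    return predicted[:horizon]


def l1_loss(pred: List[Point], target: List[Point]) -> float:
    """Mean L1 distance between two equally long point sequences."""
    total = sum(abs(px - tx) + abs(py - ty) for (px, py), (tx, ty) in zip(pred, target))
    return total / len(pred)


def step_loss(next_pred: List[Point], over_pred: List[Point],
              next_gt: List[Point], over_gt: List[Point],
              over_weight: float = 1.0) -> float:
    """Main loss on the next chunk plus a weighted auxiliary loss on the overprediction."""
    return l1_loss(next_pred, next_gt) + over_weight * l1_loss(over_pred, over_gt)


history = [(0.0, 0.0), (1.0, 0.5)]
trajectory = unroll(history, horizon=5, step_len=2, over_len=2)
```

Because the same step function consumes both observed and previously predicted positions, every step sees the most recent state of the trajectory, which is the property the abstract attributes to the decoder-only design.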

Taxonomy

Topics: Autonomous Vehicle Technology and Safety · Multimodal Machine Learning Applications · Generative Adversarial Networks and Image Synthesis