Trajeglish: Traffic Modeling as Next-Token Prediction

Jonah Philion; Xue Bin Peng; Sanja Fidler

arXiv:2312.04535 · cs.LG · April 16, 2024 · 1 cites


TL;DR

This paper introduces Trajeglish, an approach that models traffic scenarios as next-token prediction with a GPT-like model, achieving state-of-the-art realism and interaction metrics in traffic simulation.

Contribution

It presents a novel discrete tokenization scheme and an autoregressive modeling framework for multi-agent traffic scenarios, surpassing prior work on the Waymo Sim Agents Benchmark in realism and interaction.

Findings

01. Outperforms previous models on the Waymo Sim Agents Benchmark

02. Shows adaptability of learned representations to nuScenes data

03. Analyzes the impact of context length and intra-timestep interactions

Abstract

A longstanding challenge for self-driving development is simulating dynamic driving scenarios seeded from recorded driving logs. In pursuit of this functionality, we apply tools from discrete sequence modeling to model how vehicles, pedestrians and cyclists interact in driving scenarios. Using a simple data-driven tokenization scheme, we discretize trajectories to centimeter-level resolution using a small vocabulary. We then model the multi-agent sequence of discrete motion tokens with a GPT-like encoder-decoder that is autoregressive in time and takes into account intra-timestep interaction between agents. Scenarios sampled from our model exhibit state-of-the-art realism; our model tops the Waymo Sim Agents Benchmark, surpassing prior work along the realism meta metric by 3.3% and along the interaction metric by 9.9%. We ablate our modeling choices in full autonomy and partial autonomy…
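The tokenization idea described in the abstract can be illustrated with a minimal sketch. This is not the paper's data-driven scheme (which the reviews below call k-disks): the uniform grid vocabulary, the error weighting, all parameter values, and the helper names `make_vocab`, `apply_action`, `tokenize`, and `detokenize` are assumptions made for illustration only. The sketch shows one plausible way to turn a pose trajectory into a sequence of discrete motion tokens by greedily picking, at each step, the template that lands closest to the next recorded pose.

```python
import numpy as np


def make_vocab(max_step=2.0, n=9, max_dheading=0.2, n_heading=5):
    """Hypothetical vocabulary: a uniform grid of (dx, dy, dheading)
    templates expressed in the agent's local frame."""
    dx = np.linspace(-max_step, max_step, n)
    dy = np.linspace(-max_step, max_step, n)
    dh = np.linspace(-max_dheading, max_dheading, n_heading)
    grid = np.stack(np.meshgrid(dx, dy, dh, indexing="ij"), axis=-1)
    return grid.reshape(-1, 3)


def apply_action(state, action):
    """Move a pose (x, y, heading) by an action given in the pose's frame."""
    x, y, h = state
    dx, dy, dh = action
    c, s = np.cos(h), np.sin(h)
    return np.array([x + c * dx - s * dy, y + s * dx + c * dy, h + dh])


def tokenize(traj, vocab):
    """Greedy tokenization: starting from the reconstructed pose, choose the
    template whose resulting pose is closest to the next recorded pose."""
    traj = np.asarray(traj, dtype=float)
    state = traj[0].copy()
    tokens, recon = [], [state]
    for target in traj[1:]:
        candidates = np.array([apply_action(state, a) for a in vocab])
        pos_err = np.linalg.norm(candidates[:, :2] - target[:2], axis=1)
        dh = candidates[:, 2] - target[2]
        # Wrap the heading difference to [-pi, pi] before taking its size.
        head_err = np.abs(np.arctan2(np.sin(dh), np.cos(dh)))
        k = int(np.argmin(pos_err + head_err))  # equal weighting is an assumption
        tokens.append(k)
        state = candidates[k]
        recon.append(state)
    return tokens, np.array(recon)


def detokenize(start, tokens, vocab):
    """Replay a token sequence from a starting pose to recover the trajectory."""
    state = np.asarray(start, dtype=float)
    out = [state]
    for k in tokens:
        state = apply_action(state, vocab[k])
        out.append(state)
    return np.array(out)
```

Because each token is chosen relative to the reconstructed pose rather than the recorded one, discretization error from one step can be corrected at the next instead of accumulating. The resulting integer sequence is the kind of input an autoregressive sequence model could be trained on.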

Peer Reviews

Decision · ICLR 2024 poster

Reviewer 01 · Rating 6 (marginally above the acceptance threshold) · Confidence 4

Strengths

The key strengths of this work lie in its conceptual and architectural simplicity in comparison to existing methods. The idea is well-motivated and the presentation is clear. Besides this, the paper provides a detailed experimental analysis on different aspects of the proposed design space.

Weaknesses

1. The benchmarking in Table 1 follows a much simpler setting with fewer max agents (24 vs. 128) and a shorter time horizon (6 seconds vs. 8 seconds) than prior work on WOMD [1,2,3].
2. As a result of this simpler benchmark and missing comparisons to any prior architecture, this paper does not address the key question of whether the proposed method is competitive with the current state-of-the-art despite its simplicity. At a glance, it seems to be much worse, with a minADE >3m in comparison to th

Reviewer 02 · Rating 6 (marginally above the acceptance threshold) · Confidence 4

Strengths

* Strong tokenizer (k-disks) that outperforms k-means baselines with low discretization errors, supported by a convincing ablation study
* Autoregressive and causal rollouts
* Experiments demonstrating the benefits of intra-timestep dependence of agents
* Experiments demonstrating the transfer to nuScenes

Weaknesses

* Missing WOMD baseline results from other models
* Similar contributions as the recently published “MotionLM: Multi-Agent Motion Forecasting as Language Modeling” (https://arxiv.org/pdf/2309.16534.pdf)

Reviewer 03 · Rating 6 (marginally above the acceptance threshold) · Confidence 3

Strengths

1. The idea of tokenization using a small vocabulary is moderately novel.
2. The visualization and illustration are well made and help the readers to understand the paper.

Weaknesses

Major:
1. The motivation for using tokenization (compared with using the actual values, as in most of the existing work in Appendix B) is not very clear.
2. The experimental results are not very impressive:
(1) Improvements in Table 1 seem quite small. Can you show standard deviations for the results?
(2) The evaluation covers only open-loop simulation, not closed-loop simulation.
(3) The baseline details are not given (e.g., “The “marginal” baseline is an equally important baseline designed to mi


Taxonomy

Topics: Autonomous Vehicle Technology and Safety · Traffic Prediction and Management Techniques · Time Series Analysis and Forecasting