Spatio-Temporal Graph Transformer Networks for Pedestrian Trajectory Prediction
Cunjun Yu, Xiao Ma, Jiawei Ren, Haiyu Zhao, Shuai Yi

TL;DR
STAR is an attention-based framework for pedestrian trajectory prediction that models spatio-temporal interactions with Transformer mechanisms and reports state-of-the-art results on five datasets.
Contribution
The paper presents STAR, an attention-only approach that combines a new Transformer-based graph convolution (TGConv) with an external memory to improve trajectory prediction.
Findings
Achieves state-of-the-art performance on 5 datasets.
Effectively models complex spatio-temporal interactions.
Utilizes attention mechanisms exclusively for prediction.
Abstract
Understanding crowd motion dynamics is critical to real-world applications, e.g., surveillance systems and autonomous driving. This is challenging because it requires effectively modeling the socially aware crowd spatial interaction and complex temporal dependencies. We believe attention is the most important factor for trajectory prediction. In this paper, we present STAR, a Spatio-Temporal grAph tRansformer framework, which tackles trajectory prediction by only attention mechanisms. STAR models intra-graph crowd interaction by TGConv, a novel Transformer-based graph convolution mechanism. The inter-graph temporal dependencies are modeled by separate temporal Transformers. STAR captures complex spatio-temporal interactions by interleaving between spatial and temporal Transformers. To calibrate the temporal prediction for the long-lasting effect of disappeared pedestrians, we introduce…
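The abstract describes TGConv only as a Transformer-based graph convolution over the crowd graph. One plausible reading, offered here as an assumption and not as the authors' implementation, is self-attention in which each pedestrian attends only to its graph neighbours. The minimal sketch below illustrates that idea for a single time step. The function name `graph_attention`, the identity query/key/value projections, and the toy adjacency matrix are all hypothetical simplifications.

```python
import math


def graph_attention(features, adjacency):
    """Single-head self-attention restricted to graph neighbours.

    features:  list of N embedding vectors, one per pedestrian at one time step
    adjacency: N x N list of 0/1 flags; adjacency[i][j] == 1 lets i attend to j
    Assumption: queries, keys, and values are the raw features (identity
    projections). A trained layer would use learned projection matrices.
    """
    dim = len(features[0])
    scale = math.sqrt(dim)
    output = []
    for i, query in enumerate(features):
        neighbours = [j for j, flag in enumerate(adjacency[i]) if flag]
        if not neighbours:
            # No neighbours and no self-loop: pass the embedding through unchanged.
            output.append(list(query))
            continue
        # Scaled dot-product scores against each neighbour's key.
        scores = [
            sum(q * k for q, k in zip(query, features[j])) / scale
            for j in neighbours
        ]
        # Numerically stable softmax over the neighbour set only.
        peak = max(scores)
        exps = [math.exp(s - peak) for s in scores]
        total = sum(exps)
        weights = [e / total for e in exps]
        # Weighted sum of neighbour values.
        output.append([
            sum(w * features[j][d] for w, j in zip(weights, neighbours))
            for d in range(dim)
        ])
    return output


# Toy scene: pedestrians 0 and 1 interact; pedestrian 2 only sees itself.
features = [[1.0, 0.0], [0.0, 1.0], [5.0, 5.0]]
adjacency = [[1, 1, 0], [1, 1, 0], [0, 0, 1]]
updated = graph_attention(features, adjacency)
```

In this sketch the adjacency matrix acts as an attention mask, so pedestrian 2 keeps its own embedding while pedestrians 0 and 1 exchange information. Applying such a spatial step per frame and then a temporal attention step per pedestrian would correspond loosely to the interleaving the abstract mentions.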
Taxonomy
Topics: Anomaly Detection Techniques and Applications · Video Surveillance and Tracking Methods · Autonomous Vehicle Technology and Safety
Methods: Linear Layer · Absolute Position Encodings · Position-Wise Feed-Forward Layer · Residual Connection · Label Smoothing · Multi-Head Attention · Convolution · Adam · Dropout
