Latent Variable Sequential Set Transformers For Joint Multi-Agent Motion Prediction
Roger Girgis, Florian Golemo, Felipe Codevilla, Martin Weiss, Jim Aldon D'Souza, Samira Ebrahimi Kahou, Felix Heide, Christopher Pal

TL;DR
This paper introduces AutoBots, an encoder-decoder architecture using latent variable sequential set transformers for efficient, scene-consistent multi-agent trajectory prediction, with top results on the nuScenes leaderboard and strong results on Argoverse and TrajNet++.
Contribution
The paper proposes AutoBots, a new model combining temporal and social attention modules for joint multi-agent motion prediction, capable of single-pass inference and trained efficiently on a single GPU.
Findings
Top results on nuScenes vehicle motion prediction leaderboard
Strong performance on Argoverse vehicle prediction challenge
Effective multi-agent social prediction on TrajNet++ dataset
Abstract
Robust multi-agent trajectory prediction is essential for the safe control of robotic systems. A major challenge is to efficiently learn a representation that approximates the true joint distribution of contextual, social, and temporal information to enable planning. We propose Latent Variable Sequential Set Transformers which are encoder-decoder architectures that generate scene-consistent multi-agent trajectories. We refer to these architectures as "AutoBots". The encoder is a stack of interleaved temporal and social multi-head self-attention (MHSA) modules which alternately perform equivariant processing across the temporal and social dimensions. The decoder employs learnable seed parameters in combination with temporal and social MHSA modules allowing it to perform inference over the entire future scene in a single forward pass efficiently. AutoBots can produce either the trajectory…
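The interleaving of temporal and social attention described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: it assumes a single attention head, random (untrained) projection weights, and a plain residual connection, and it omits positional encodings, masking, layer normalization, feed-forward layers, the latent variable, and the decoder's learnable seed parameters. The names `self_attention` and `encoder_block` are hypothetical.

```python
import numpy as np

def self_attention(x, wq, wk, wv):
    """Single-head scaled dot-product self-attention.

    x has shape (..., n, d); attention runs over the n axis,
    independently for every leading index.
    """
    q, k, v = x @ wq, x @ wk, x @ wv
    scores = q @ np.swapaxes(k, -1, -2) / np.sqrt(q.shape[-1])
    scores = scores - scores.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ v

def encoder_block(x, temporal_w, social_w):
    """One temporal pass followed by one social pass.

    x has shape (M agents, K time steps, d features).
    """
    # Temporal attention: each agent attends over its own time steps.
    x = x + self_attention(x, *temporal_w)
    # Social attention: at each time step, agents attend to each other.
    xs = np.swapaxes(x, 0, 1)                 # (K, M, d)
    xs = xs + self_attention(xs, *social_w)
    return np.swapaxes(xs, 0, 1)              # back to (M, K, d)

rng = np.random.default_rng(0)
M, K, d = 3, 5, 8  # assumed toy sizes
x = rng.normal(size=(M, K, d))
temporal_w = [rng.normal(size=(d, d)) / np.sqrt(d) for _ in range(3)]
social_w = [rng.normal(size=(d, d)) / np.sqrt(d) for _ in range(3)]

out = encoder_block(x, temporal_w, social_w)

# Reordering the agents in the input reorders the output the same way.
perm = np.array([2, 0, 1])
out_perm = encoder_block(x[perm], temporal_w, social_w)
```

In this simplified setting, comparing `out_perm` with `out[perm]` shows that the block is permutation-equivariant over agents: the social pass carries no notion of agent order, and the temporal pass treats each agent independently.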
Taxonomy
Topics: Autonomous Vehicle Technology and Safety · Traffic Prediction and Management Techniques · Time Series Analysis and Forecasting
Methods: Linear Layer · Absolute Position Encodings · Position-Wise Feed-Forward Layer · Softmax · Dense Connections · Attention Is All You Need · Dropout · Residual Connection · Byte Pair Encoding · Layer Normalization
