Learning dynamic and hierarchical traffic spatiotemporal features with Transformer
Haoyang Yan, Xiaolei Ma

TL;DR
This paper introduces Traffic Transformer, a novel deep learning model that leverages hierarchical multi-head attention mechanisms to dynamically capture complex spatiotemporal dependencies in traffic data, improving long-term forecasting accuracy.
Contribution
Adapts the Transformer architecture to traffic forecasting, replacing the predefined, fixed adjacency matrix used by GCN-based models with attention that extracts dynamic, hierarchical spatiotemporal features.
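To make the contrast with a fixed adjacency matrix concrete, the following is a minimal sketch, not the authors' implementation. It computes scaled dot-product self-attention weights between three sensors and shows that the resulting weight matrix changes with the input, whereas a predefined adjacency matrix does not. The sensor features, the three-node line graph, and the function names are all hypothetical, and the raw features stand in for the learned query and key projections a real Transformer would use.

```python
import math


def softmax(row):
    # Numerically stable softmax over one list of scores.
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def attention_weights(features):
    """Scaled dot-product self-attention weights between sensors.

    features: one feature vector per sensor (hypothetical inputs).
    Queries and keys are the raw features here; a trained model would
    apply learned linear projections first.
    """
    d = len(features[0])
    scores = [
        [sum(q * k for q, k in zip(fi, fj)) / math.sqrt(d) for fj in features]
        for fi in features
    ]
    return [softmax(row) for row in scores]


def aggregate(weights, features):
    # Each sensor's output is a weighted sum of all sensors' features.
    d = len(features[0])
    return [
        [sum(w * f[c] for w, f in zip(row, features)) for c in range(d)]
        for row in weights
    ]


# A fixed, row-normalized adjacency for three sensors on a line.
# It is the same no matter what the traffic looks like.
fixed_adj = [[0.5, 0.5, 0.0], [1 / 3, 1 / 3, 1 / 3], [0.0, 0.5, 0.5]]

# Two hypothetical traffic states for the same three sensors.
free_flow = [[1.0, 0.0], [1.0, 0.1], [0.9, 0.0]]
congested = [[1.0, 0.0], [0.0, 2.0], [0.0, 2.1]]

w_free = attention_weights(free_flow)
w_cong = attention_weights(congested)

# The attention weights differ between the two states, and the result
# is used exactly where a fixed adjacency would be used.
out_fixed = aggregate(fixed_adj, congested)
out_attn = aggregate(w_cong, congested)
```

Each row of `w_free` and `w_cong` sums to 1, so it can be read as a soft, input-dependent adjacency. In the congested state, sensor 1 puts more weight on sensor 2 than on sensor 0 because their features are similar, which is the kind of pattern an attention weight analysis would inspect.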
Findings
Outperforms state-of-the-art models on public datasets
Effectively captures dynamic spatial dependencies
Provides interpretability through attention weight analysis
Abstract
Traffic forecasting is an indispensable part of intelligent transportation systems (ITS), and accurate long-term, network-wide traffic speed forecasting is one of its most challenging tasks. Recently, deep learning methods have become popular in this domain. As traffic data are physically associated with road networks, most proposed models treat forecasting as a spatiotemporal graph modeling problem and use Graph Convolutional Network (GCN) based methods. These GCN-based models depend heavily on a predefined, fixed adjacency matrix to reflect the spatial dependency. However, a predefined, fixed adjacency matrix is limited in reflecting the actual dependence of traffic flow. This paper proposes a novel model, Traffic Transformer, for spatial-temporal graph modeling and long-term traffic forecasting to overcome these limitations. Transformer is the most popular framework in Natural Language…
Taxonomy
Topics: Traffic Prediction and Management Techniques · Data Management and Algorithms · Transportation Planning and Optimization
Methods: Attention Is All You Need · Linear Layer · Absolute Position Encodings · Position-Wise Feed-Forward Layer · Dense Connections · Softmax · Layer Normalization · Residual Connection · Adam · Dropout
