STGformer: Efficient Spatiotemporal Graph Transformer for Traffic Forecasting
Hongjun Wang, Jiyuan Chen, Tong Pan, Zheng Dong, Lingyu Zhang, Renhe Jiang, and Xuan Song

TL;DR
STGformer is a spatiotemporal graph transformer that cuts the computational cost and memory usage of traffic forecasting while maintaining high accuracy, making it practical for large-scale, real-world road networks.
Contribution
The paper introduces STGformer, a new architecture that captures high-order spatiotemporal interactions in a single layer, achieving a 100x inference speedup and a 99.8% reduction in GPU memory usage relative to STAEformer.
Findings
Achieves 100x faster inference than STAEformer.
Reduces GPU memory usage by 99.8%.
Outperforms state-of-the-art methods on the LargeST benchmark.
Abstract
Traffic forecasting is a cornerstone of smart city management, enabling efficient resource allocation and transportation planning. Deep learning, with its ability to capture complex nonlinear patterns in spatiotemporal (ST) data, has emerged as a powerful tool for traffic forecasting. While graph convolutional networks (GCNs) and transformer-based models have shown promise, their computational demands often hinder their application to real-world road networks, particularly those with large-scale spatiotemporal interactions. To address these challenges, we propose a novel spatiotemporal graph transformer (STGformer) architecture. STGformer effectively balances the strengths of GCNs and Transformers, enabling efficient modeling of both global and local traffic patterns while maintaining a manageable computational footprint. Unlike traditional approaches that require multiple attention layers,…
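As a rough illustration of the local-versus-global distinction the abstract draws between GCNs and Transformers, the NumPy sketch below contrasts one generic graph-convolution step (each sensor mixes features only with its road-graph neighbours) with one generic self-attention step (each sensor attends to every other sensor, at a cost that grows quadratically with the number of nodes). It is not the STGformer layer: the function names, the toy five-node path graph, and the random weights are all assumptions made for illustration.

```python
import numpy as np

def normalized_adjacency(adj):
    """Symmetric normalization D^{-1/2} (A + I) D^{-1/2}, as in a standard GCN."""
    a_hat = adj + np.eye(adj.shape[0])          # add self-loops
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    return a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def graph_conv(x, adj, weight):
    """Local mixing: each node aggregates features from its graph neighbours."""
    return normalized_adjacency(adj) @ x @ weight

def self_attention(x, w_q, w_k, w_v):
    """Global mixing: every node attends to every node (N x N score matrix)."""
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    scores = (q @ k.T) / np.sqrt(q.shape[-1])
    scores -= scores.max(axis=-1, keepdims=True)   # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)  # row-wise softmax
    return weights @ v, weights

# Hypothetical toy setup: 5 sensors on a path graph 0-1-2-3-4, 4 features each.
rng = np.random.default_rng(0)
num_nodes, dim = 5, 4
adj = np.zeros((num_nodes, num_nodes))
for i in range(num_nodes - 1):
    adj[i, i + 1] = adj[i + 1, i] = 1.0

x = rng.normal(size=(num_nodes, dim))
w, w_q, w_k, w_v = (rng.normal(size=(dim, dim)) for _ in range(4))

local_out = graph_conv(x, adj, w)                    # sparse, neighbour-only mixing
global_out, attn = self_attention(x, w_q, w_k, w_v)  # dense, all-pairs mixing
```

In this toy setup the normalized adjacency has a zero entry for sensors 0 and 4, which are not neighbours, while the attention matrix assigns a non-zero weight to every pair. That dense all-pairs matrix is the source of the computational cost the abstract refers to.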
Taxonomy
Topics: Traffic Prediction and Management Techniques · Time Series Analysis and Forecasting · Advanced Computational Techniques and Applications
Methods: Attention Is All You Need · Linear Layer · Laplacian EigenMap · Multi-Head Attention · Layer Normalization · Dense Connections · Adam · Laplacian Positional Encodings · Residual Connection · Position-Wise Feed-Forward Layer
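
The method tags above include Laplacian eigenmaps and Laplacian positional encodings. This listing does not say how STGformer uses them, so the following is only a generic sketch of the usual construction: take the eigenvectors of the normalized graph Laplacian with the smallest non-trivial eigenvalues and use them as per-node position features. The function name, the choice of `k`, and the six-node ring graph are hypothetical.

```python
import numpy as np

def laplacian_positional_encoding(adj, k):
    """Return the k lowest non-trivial eigenvectors of L = I - D^{-1/2} A D^{-1/2}."""
    adj = np.asarray(adj, dtype=float)
    n = adj.shape[0]
    if k >= n:
        raise ValueError("k must be smaller than the number of nodes")
    deg = adj.sum(axis=1)
    d_inv_sqrt = np.zeros_like(deg)
    nonzero = deg > 0
    d_inv_sqrt[nonzero] = deg[nonzero] ** -0.5   # isolated nodes keep 0
    lap = np.eye(n) - d_inv_sqrt[:, None] * adj * d_inv_sqrt[None, :]
    # eigh returns eigenvalues in ascending order for symmetric matrices.
    _, eigvecs = np.linalg.eigh(lap)
    # Skip the first eigenvector (eigenvalue ~0 on a connected graph).
    # Eigenvector signs are arbitrary, so encodings are defined up to a sign flip.
    return eigvecs[:, 1:k + 1]

# Hypothetical toy graph: 6 sensors arranged in a ring.
n = 6
adj = np.zeros((n, n))
for i in range(n):
    adj[i, (i + 1) % n] = adj[(i + 1) % n, i] = 1.0

pe = laplacian_positional_encoding(adj, k=2)   # one 2-dimensional code per node
```

Each row of `pe` can then be concatenated with, or added to, a node's input features so that an attention layer has access to graph position.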
