Linear Attention is Enough in Spatial-Temporal Forecasting

Xinyu Ning

arXiv:2408.09158·cs.LG·September 16, 2024

TL;DR

This paper introduces a Transformer-based approach to spatial-temporal traffic forecasting that treats each road-network node at each time step as an independent token, together with a Nyström-based variant that reduces attention to linear complexity, and reports state-of-the-art results.

Contribution

Proposes STformer and NSTformer models that effectively capture spatial-temporal patterns with improved efficiency and accuracy in traffic forecasting.

Findings

01. Achieves state-of-the-art performance on traffic datasets.

02. NSTformer offers linear complexity with competitive accuracy.

03. Models outperform existing methods in capturing dynamic road network topology.
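
The linear-complexity claim for NSTformer rests on a Nyström approximation of softmax attention. The following is a minimal NumPy sketch of that general idea, assuming a Nyströmformer-style formulation with segment-mean landmarks and an exact pseudo-inverse; this page does not spell out NSTformer's actual design, so the function name `nystrom_attention`, the landmark choice, and all sizes are illustrative assumptions rather than the author's implementation.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def exact_attention(q, k, v):
    # Standard softmax attention: builds the full (L, L) matrix.
    scale = 1.0 / np.sqrt(q.shape[-1])
    return softmax(q @ k.T * scale) @ v

def nystrom_attention(q, k, v, num_landmarks):
    # Hypothetical helper: Nystrom approximation of softmax attention.
    # Assumption: landmarks are means over contiguous token segments.
    L, d = q.shape
    assert L % num_landmarks == 0, "L must be divisible by num_landmarks"
    scale = 1.0 / np.sqrt(d)
    seg = L // num_landmarks
    q_l = q.reshape(num_landmarks, seg, d).mean(axis=1)  # (m, d)
    k_l = k.reshape(num_landmarks, seg, d).mean(axis=1)  # (m, d)
    f = softmax(q @ k_l.T * scale)    # (L, m)
    a = softmax(q_l @ k_l.T * scale)  # (m, m)
    b = softmax(q_l @ k.T * scale)    # (m, L)
    # Multiply right to left so no (L, L) matrix is ever formed.
    return f @ (np.linalg.pinv(a) @ (b @ v))

rng = np.random.default_rng(0)
L, d = 16, 4  # illustrative sizes only
q, k, v = (rng.standard_normal((L, d)) for _ in range(3))
approx = nystrom_attention(q, k, v, num_landmarks=4)
full = nystrom_attention(q, k, v, num_landmarks=L)
```

For a fixed number of landmarks m, the largest intermediate matrices have shape (L, m) and (m, L), so time and memory grow linearly in the number of tokens L. When m equals L the landmarks coincide with the tokens and the approximation reduces to exact attention. Nyströmformer itself replaces the exact pseudo-inverse with an iterative approximation; `np.linalg.pinv` is used here only for brevity.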

Abstract

As the most representative scenario of spatial-temporal forecasting, traffic forecasting has attracted considerable attention from the machine learning community due to its intricate correlations in both the space and time dimensions. Existing methods often treat road networks over time as spatial-temporal graphs, addressing spatial and temporal representations independently. However, these approaches struggle to capture the dynamic topology of road networks, encounter issues with message-passing mechanisms and over-smoothing, and face challenges in learning spatial and temporal relationships separately. To address these limitations, we propose treating nodes in road networks at different time steps as independent spatial-temporal tokens and feeding them into a vanilla Transformer to learn complex spatial-temporal patterns; the resulting model, STformer, achieves SOTA. Given its quadratic…
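
The tokenization described in the abstract can be illustrated with a small NumPy sketch: a traffic tensor of shape (T, N, d) is flattened into T·N tokens, and a single self-attention step lets every (time step, node) pair attend to every other. The helper names, the random projection weights, and the sizes below are assumptions made for illustration; this is not the STformer code from the repository.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def flatten_to_tokens(x):
    # (T, N, d) -> (T*N, d): one token per (time step, node) pair.
    T, N, d = x.shape
    return x.reshape(T * N, d)

def self_attention(tokens, Wq, Wk, Wv):
    # Single-head scaled dot-product self-attention over all tokens.
    q, k, v = tokens @ Wq, tokens @ Wk, tokens @ Wv
    scores = q @ k.T / np.sqrt(q.shape[-1])  # (T*N, T*N)
    weights = softmax(scores, axis=-1)
    return weights @ v, weights

rng = np.random.default_rng(0)
T, N, d = 12, 5, 8  # illustrative: 12 time steps, 5 nodes, 8 features
x = rng.standard_normal((T, N, d))
tokens = flatten_to_tokens(x)
# Hypothetical random projections standing in for learned weights.
Wq, Wk, Wv = (rng.standard_normal((d, d)) / np.sqrt(d) for _ in range(3))
out, weights = self_attention(tokens, Wq, Wk, Wv)
```

The attention matrix here has (T·N)² entries, so its cost grows quadratically with the number of spatial-temporal tokens.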

Code & Models

Repositories

xinyuning/stformer-and-nstformer
PyTorch · Official

Taxonomy

Topics: Remote Sensing and LiDAR Applications

Methods: Linear Layer · Residual Connection · Layer Normalization · Multi-Head Attention · Position-Wise Feed-Forward Layer · Adam · Attention Is All You Need · Byte Pair Encoding · Absolute Position Encodings · Softmax