Long-Range Transformers for Dynamic Spatiotemporal Forecasting

Jake Grigsby; Zhe Wang; Nam Nguyen; Yanjun Qi

arXiv:2109.12218 · cs.LG · March 21, 2023 · 78 cites

TL;DR

This paper introduces Spacetimeformer, a Long-Range Transformer that jointly learns spatial, temporal, and value interactions in multivariate time series forecasting and achieves competitive results on traffic, electricity, and weather benchmarks.

Contribution

The work proposes a novel spatiotemporal sequence formulation and a Transformer architecture that learns variable relationships directly from data without predefined graphs.

Findings

01. Achieves competitive results on traffic, electricity, and weather benchmarks.

02. Learns dynamic spatiotemporal relationships purely from data.

03. Outperforms traditional graph-based and sequence models.

Abstract

Multivariate time series forecasting focuses on predicting future values based on historical context. State-of-the-art sequence-to-sequence models rely on neural attention between timesteps, which allows for temporal learning but fails to consider distinct spatial relationships between variables. In contrast, methods based on graph neural networks explicitly model variable relationships. However, these methods often rely on predefined graphs that cannot change over time and perform separate spatial and temporal updates without establishing direct connections between each variable at every timestep. Our work addresses these problems by translating multivariate forecasting into a "spatiotemporal sequence" formulation where each Transformer input token represents the value of a single variable at a given time. Long-Range Transformers can then learn interactions between space, time, and…
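The "spatiotemporal sequence" formulation described above can be illustrated with a small sketch. This is not the authors' implementation: the function name `to_spatiotemporal_tokens`, the time-major token ordering, and the (time index, variable index, value) layout are assumptions made for illustration. It only shows how a series with T timesteps and N variables becomes T × N tokens, each carrying one variable's value at one time, so that attention could in principle relate any variable at any timestep to any other.

```python
import numpy as np


def to_spatiotemporal_tokens(series):
    """Flatten a (T, N) series into T*N tokens of (time_index, variable_index, value).

    Hypothetical helper for illustration only; a real model would embed
    each of these three fields before feeding the tokens to a Transformer.
    """
    series = np.asarray(series, dtype=float)
    if series.ndim != 2:
        raise ValueError("expected a 2-D array of shape (timesteps, variables)")
    num_steps, num_vars = series.shape
    # Index grids with the same (T, N) shape as the series.
    time_idx, var_idx = np.meshgrid(
        np.arange(num_steps), np.arange(num_vars), indexing="ij"
    )
    # Time-major flattening (an assumption): all variables at t=0, then t=1, ...
    return np.stack([time_idx.ravel(), var_idx.ravel(), series.ravel()], axis=1)


# Toy input: 2 timesteps, 3 variables.
series = np.array([[1.0, 10.0, 100.0],
                   [2.0, 20.0, 200.0]])
tokens = to_spatiotemporal_tokens(series)
print(tokens.shape)  # one token per (timestep, variable) pair
```

In this layout the sequence length grows as T × N rather than T, which is one reason the abstract refers to Long-Range Transformers.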

Taxonomy

Topics: Time Series Analysis and Forecasting · Traffic Prediction and Management Techniques · Energy Load and Power Forecasting

Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Softmax · Layer Normalization · Byte Pair Encoding · Dense Connections · Dropout · Absolute Position Encodings · Position-Wise Feed-Forward Layer