LargeST: A Benchmark Dataset for Large-Scale Traffic Forecasting
Xu Liu, Yutong Xia, Yuxuan Liang, Junfeng Hu, Yiwei Wang, Lei Bai, Chao Huang, Zhenguang Liu, Bryan Hooi, Roger Zimmermann

TL;DR
LargeST is a comprehensive, large-scale traffic forecasting dataset covering 8,600 sensors over five years, designed to address limitations of existing datasets and facilitate advanced deep learning research in traffic prediction.
Contribution
The paper introduces LargeST, a new extensive traffic dataset with long-term coverage and detailed metadata, enabling more realistic and scalable traffic forecasting research.
Findings
Benchmarking of baseline models on LargeST
Insights into data patterns and challenges
Identification of future research opportunities
Abstract
Road traffic forecasting plays a critical role in smart city initiatives and has experienced significant advancements thanks to the power of deep learning in capturing non-linear patterns of traffic data. However, the promising results achieved on current public datasets may not be applicable to practical scenarios due to limitations within these datasets. First, the limited sizes of them may not reflect the real-world scale of traffic networks. Second, the temporal coverage of these datasets is typically short, posing hurdles in studying long-term patterns and acquiring sufficient samples for training deep models. Third, these datasets often lack adequate metadata for sensors, which compromises the reliability and interpretability of the data. To mitigate these limitations, we introduce the LargeST benchmark dataset. It encompasses a total number of 8,600 sensors in California with a…
Taxonomy
Topics: Traffic Prediction and Management Techniques · Time Series Analysis and Forecasting · Human Mobility and Location-Based Analysis
