BLAST: Balanced Sampling Time Series Corpus for Universal Forecasting Models
Zezhi Shao, Yujie Li, Fei Wang, Chengqing Yu, Yisong Fu, Tangwen Qian, Bin Xu, Boyu Diao, Yongjun Xu, Xueqi Cheng

TL;DR
BLAST is a balanced-sampling pre-training corpus for time series forecasting that improves model generalization and training efficiency by increasing data diversity through grid-based clustering, grid sampling, and grid mixup.
Contribution
We introduce BLAST, a large-scale, diversity-enhanced pre-training corpus for universal time series forecasting models, utilizing balanced sampling, clustering, and mixup methods.
Findings
Models trained on BLAST outperform models trained on existing datasets in forecasting accuracy.
BLAST reduces the training resources needed to reach state-of-the-art performance.
Data diversity is crucial for effective universal time series forecasting.
Abstract
The advent of universal time series forecasting models has revolutionized zero-shot forecasting across diverse domains, yet the critical role of data diversity in training these models remains underexplored. Existing large-scale time series datasets often suffer from inherent biases and imbalanced distributions, leading to suboptimal model performance and generalization. To address this gap, we introduce BLAST, a novel pre-training corpus designed to enhance data diversity through a balanced sampling strategy. First, BLAST incorporates 321 billion observations from publicly available datasets and employs a comprehensive suite of statistical metrics to characterize time series patterns. Then, to facilitate pattern-oriented sampling, the data is implicitly clustered using grid-based partitioning. Furthermore, by integrating grid sampling and grid mixup techniques, BLAST ensures a balanced…
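The grid-based partitioning, grid sampling, and grid mixup steps named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`grid_cells`, `grid_sample`, `grid_mixup`), the equal-width binning, the uniform choice over non-empty cells, and the Dirichlet mixing weights are all assumptions made for illustration, and the per-series feature vectors stand in for whatever statistical metrics the corpus actually uses. All series are assumed to share one length.

```python
import numpy as np

def grid_cells(features, bins_per_dim):
    """Assign each series to a grid cell based on its feature vector.

    Hypothetical helper: uses equal-width bins over each feature's range.
    Returns a dict mapping cell coordinates to lists of series indices.
    """
    lo = features.min(axis=0)
    hi = features.max(axis=0)
    span = np.where(hi > lo, hi - lo, 1.0)       # avoid division by zero
    scaled = (features - lo) / span              # each feature in [0, 1]
    idx = np.minimum((scaled * bins_per_dim).astype(int), bins_per_dim - 1)
    cells = {}
    for i, key in enumerate(map(tuple, idx)):
        cells.setdefault(key, []).append(i)
    return cells

def grid_sample(cells, n, rng):
    """Draw n series indices: pick a non-empty cell uniformly at random,
    then pick a series uniformly inside that cell (assumed scheme)."""
    keys = list(cells)
    picks = []
    for _ in range(n):
        members = cells[keys[rng.integers(len(keys))]]
        picks.append(members[rng.integers(len(members))])
    return np.array(picks)

def grid_mixup(series, cells, k, rng):
    """Return a convex combination of k grid-sampled series.

    The Dirichlet weights are an assumption; they are non-negative
    and sum to 1, so the result stays within the range of its inputs.
    """
    idx = grid_sample(cells, k, rng)
    weights = rng.dirichlet(np.ones(k))
    return weights @ series[idx]

# Toy corpus: 990 series share one pattern, 10 share a rare one.
rng = np.random.default_rng(0)
features = np.vstack([np.zeros((990, 2)), np.ones((10, 2))])
series = rng.normal(size=(1000, 16))

cells = grid_cells(features, bins_per_dim=4)
picks = grid_sample(cells, 2000, rng)
rare_fraction = np.mean(picks >= 990)   # share of draws from the rare pattern
mixed = grid_mixup(series, cells, k=3, rng=rng)
```

Under uniform sampling over series, the rare pattern in this toy corpus would account for about 1% of draws. Sampling cells first gives each occupied cell the same probability, so the rare pattern is drawn roughly as often as the dominant one.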
Taxonomy
Methods: Mixup
