Mitigating Data Scarcity in Time Series Analysis: A Foundation Model with Series-Symbol Data Generation

Wenxuan Wang; Kai Wu; Yujian Betterest Li; Dan Wang; Xiaoyu Zhang; Jing Liu

arXiv:2502.15466·cs.LG·February 24, 2025

TL;DR

This paper introduces SymTime, a foundation model for time series analysis that uses a novel series-symbol data generation method to overcome data scarcity, achieving competitive results across multiple tasks.

Contribution

The paper presents a dual-modality data generation mechanism and a pre-trained foundation model, SymTime, for improved time series analysis under data scarcity conditions.

Findings

01

SymTime performs competitively on five time series analysis (TSA) tasks.

02

The series-symbol data generation enhances data diversity and quality.

03

Pretraining on generated data rivals real-world dataset pretraining.

Abstract

Foundation models for time series analysis (TSA) have attracted significant attention. However, challenges such as data scarcity and data imbalance continue to hinder their development. To address this, we consider modeling complex systems through symbolic expressions that serve as semantic descriptors of time series. Building on this concept, we introduce a series-symbol (S2) dual-modality data generation mechanism, enabling the unrestricted creation of high-quality time series data paired with corresponding symbolic representations. Leveraging the S2 dataset, we develop SymTime, a pre-trained foundation model for TSA. SymTime demonstrates competitive performance across five major TSA tasks when fine-tuned on downstream tasks, rivaling foundation models pre-trained on real-world datasets. This approach underscores the potential of dual-modality data generation and pretraining…
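
The idea of pairing a symbolic expression with the series it produces can be illustrated with a small sketch. This is not the paper's S2 generator: the operator sets, the sampling probabilities, the single input variable, and the names `sample_expression` and `generate_pair` are all assumptions made here for illustration. It only shows the general shape of the mechanism, where a randomly built expression serves as the symbol and its evaluation on a grid serves as the series.

```python
import math
import random

# Hypothetical operator sets, chosen only for illustration (not from the paper).
UNARY = {"sin": math.sin, "cos": math.cos, "tanh": math.tanh}
BINARY = {"+": lambda a, b: a + b, "*": lambda a, b: a * b}


def sample_expression(rng, depth=2):
    """Return (symbol_string, callable) for a random expression in one variable x."""
    # Leaf: a scaled copy of the input variable.
    if depth == 0 or rng.random() < 0.3:
        coeff = round(rng.uniform(0.5, 2.0), 2)
        return f"{coeff}*x", lambda x, c=coeff: c * x
    # Unary node: wrap a sub-expression in a bounded function.
    if rng.random() < 0.5:
        name = rng.choice(sorted(UNARY))
        inner_str, inner_fn = sample_expression(rng, depth - 1)
        fn = UNARY[name]
        return f"{name}({inner_str})", lambda x, f=fn, g=inner_fn: f(g(x))
    # Binary node: combine two independently sampled sub-expressions.
    op = rng.choice(sorted(BINARY))
    left_str, left_fn = sample_expression(rng, depth - 1)
    right_str, right_fn = sample_expression(rng, depth - 1)
    fn = BINARY[op]
    symbol = f"({left_str} {op} {right_str})"
    return symbol, lambda x, f=fn, l=left_fn, r=right_fn: f(l(x), r(x))


def generate_pair(seed, length=64):
    """Sample one expression and evaluate it on a grid to get a (symbol, series) pair."""
    rng = random.Random(seed)
    symbol, fn = sample_expression(rng)
    # Assumed sampling grid: evenly spaced points on [0, 4*pi].
    xs = [4.0 * math.pi * i / (length - 1) for i in range(length)]
    series = [fn(x) for x in xs]
    return symbol, series


if __name__ == "__main__":
    symbol, series = generate_pair(seed=0)
    print(symbol)
    print(series[:5])
```

Because each pair is derived from a seed rather than collected, a generator of this kind can produce as many pairs as needed, which is the property the abstract refers to as unrestricted creation of data.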
