STEP: Scientific Time-Series Encoder Pretraining via Cross-Domain Distillation
Chen Zhang, Liwei Liu, Jun Tao, Xiaoyu Yang, Xuenan Xu, Kai Chen, Bowen Zhou, Wen Wu, Chao Zhang

TL;DR
This paper introduces STEP, a pretraining framework that leverages multiple foundation models through cross-domain distillation to create a unified, transferable encoder for scientific time series analysis.
Contribution
The paper proposes STEP, a pretraining method that distills knowledge from foundation models in related time-series domains (audio, general time series, and brain signals) into a single encoder, improving representation learning for scientific time series.
Findings
STEP outperforms existing methods on seven scientific time series tasks.
Cross-domain distillation enhances the transferability of learned representations.
Adaptive patching and statistics compensation improve handling of extreme-length sequences and diverse scales.
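The summary does not say how STEP implements adaptive patching or statistics compensation, so the following is only a minimal sketch of the general idea under stated assumptions, not the authors' method. It assumes that "adaptive patching" means choosing a patch length that grows with sequence length so the patch count stays bounded, and that "statistics compensation" means keeping the per-series mean and standard deviation removed by normalization so they can be passed along as extra features. The function names and the parameters `max_patches` and `min_patch_len` are hypothetical.

```python
import numpy as np

def normalize_with_stats(x, eps=1e-8):
    """Instance-normalize a 1-D series and return the statistics that were removed.

    Assumption: the returned [mean, std] pair would be fed to the encoder as
    side information so that scale is not lost after normalization.
    """
    mean = float(x.mean())
    std = float(x.std())
    return (x - mean) / (std + eps), np.array([mean, std])

def adaptive_patch(x, max_patches=64, min_patch_len=4):
    """Split a 1-D series into at most `max_patches` equal-length patches.

    The patch length scales with the sequence length, so very long inputs
    yield longer patches rather than more of them. The tail is zero-padded.
    """
    patch_len = max(min_patch_len, int(np.ceil(len(x) / max_patches)))
    n_patches = int(np.ceil(len(x) / patch_len))
    pad = n_patches * patch_len - len(x)
    padded = np.pad(x, (0, pad))
    return padded.reshape(n_patches, patch_len), patch_len

# Usage: a short and a very long series both map to a bounded number of patches.
short = np.arange(100, dtype=float)
long_ = np.arange(100_000, dtype=float)
for series in (short, long_):
    z, stats = normalize_with_stats(series)
    patches, patch_len = adaptive_patch(z)
    print(patches.shape, patch_len, stats)
```

Bounding the patch count keeps the token sequence seen by a Transformer-style encoder at a manageable length regardless of input length, at the cost of coarser patches for long signals.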
Abstract
Scientific time series are central to scientific AI but are typically sparse, highly heterogeneous, and limited in scale, making unified representation learning particularly challenging. Meanwhile, foundation models pretrained on relevant time series domains such as audio, general time series, and brain signals contain rich knowledge, but their applicability to scientific signals remains underexplored. In this paper, we investigate the transferability and complementarity of foundation models from relevant time series domains, and study how to effectively leverage them to build a unified encoder for scientific time series. We first systematically evaluate relevant foundation models, showing the effectiveness of knowledge transfer to scientific tasks and their complementary strengths. Based on this observation, we propose STEP, a Scientific Time Series Encoder Pretraining framework via…
Taxonomy
Topics: Time Series Analysis and Forecasting · Machine Learning in Healthcare · Generative Adversarial Networks and Image Synthesis
