Multi-Scale Finetuning for Encoder-based Time Series Foundation Models
Zhongzheng Qiao, Chenghao Liu, Yiming Zhang, Ming Jin, Quang Pham, Qingsong Wen, P.N. Suganthan, Xudong Jiang, Savitha Ramasamy

TL;DR
This paper introduces Multiscale Finetuning (MSFT), a novel framework for encoder-based time series foundation models that explicitly models multiple scales during finetuning, significantly improving forecasting performance.
Contribution
The paper proposes MSFT, a general finetuning method that builds explicit multi-scale modeling into encoder-based time series foundation models (TSFMs) and outperforms existing approaches.
Findings
MSFT outperforms naive finetuning methods.
MSFT surpasses state-of-the-art deep learning models.
Experiments on three backbones confirm MSFT's effectiveness.
Abstract
Time series foundation models (TSFMs) demonstrate impressive zero-shot performance for time series forecasting. However, an important yet underexplored challenge is how to effectively finetune TSFMs on specific downstream tasks. While naive finetuning can yield performance gains, we argue that it falls short of fully leveraging TSFMs' capabilities, often resulting in overfitting and suboptimal performance. Given the diverse temporal patterns across sampling scales and the inherent multi-scale forecasting capabilities of TSFMs, we adopt a causal perspective to analyze the finetuning process, through which we highlight the critical importance of explicitly modeling multiple scales and reveal the shortcomings of naive approaches. Focusing on encoder-based TSFMs, we propose Multiscale finetuning (MSFT), a simple yet general framework that explicitly integrates multi-scale modeling into the…
Taxonomy
Topics: Time Series Analysis and Forecasting · Neural Networks and Applications
Methods: ADaptive gradient method with the OPTimal convergence rate
