Multi-Scale Finetuning for Encoder-based Time Series Foundation Models

Zhongzheng Qiao; Chenghao Liu; Yiming Zhang; Ming Jin; Quang Pham; Qingsong Wen; P.N. Suganthan; Xudong Jiang; Savitha Ramasamy

arXiv:2506.14087·cs.LG·October 13, 2025

Open Access

TL;DR

This paper introduces Multiscale Finetuning (MSFT), a novel framework for encoder-based time series foundation models that explicitly models multiple scales during finetuning, significantly improving forecasting performance.

Contribution

The paper proposes MSFT, a general multiscale finetuning method that enhances encoder-based TSFMs by explicitly incorporating multi-scale modeling, outperforming existing approaches.

Findings

01. MSFT outperforms naive finetuning methods.
02. MSFT surpasses state-of-the-art deep learning models.
03. Experimental validation on three backbones confirms effectiveness.

Abstract

Time series foundation models (TSFMs) demonstrate impressive zero-shot performance for time series forecasting. However, an important yet underexplored challenge is how to effectively finetune TSFMs on specific downstream tasks. While naive finetuning can yield performance gains, we argue that it falls short of fully leveraging TSFMs' capabilities, often resulting in overfitting and suboptimal performance. Given the diverse temporal patterns across sampling scales and the inherent multi-scale forecasting capabilities of TSFMs, we adopt a causal perspective to analyze the finetuning process, through which we highlight the critical importance of explicitly modeling multiple scales and reveal the shortcomings of naive approaches. Focusing on encoder-based TSFMs, we propose Multiscale finetuning (MSFT), a simple yet general framework that explicitly integrates multi-scale modeling into the…
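To make "multiple sampling scales" concrete, the sketch below builds coarser views of a series by repeated average pooling. This is an illustration only: the abstract as shown here does not say how MSFT constructs its scales, so the pooling operator, the factor of 2, and the helper names `average_pool` and `build_scales` are all assumptions.

```python
# Illustrative sketch, NOT the paper's implementation.
# Assumption: coarser scales are obtained by non-overlapping average pooling.

def average_pool(series, factor):
    """Average non-overlapping windows of length `factor`; drop any incomplete tail."""
    usable = (len(series) // factor) * factor
    return [sum(series[i:i + factor]) / factor for i in range(0, usable, factor)]


def build_scales(series, num_scales=3, factor=2):
    """Return the original series followed by progressively coarser versions."""
    scales = [list(series)]
    for _ in range(num_scales - 1):
        pooled = average_pool(scales[-1], factor)
        if not pooled:  # series too short to downsample further
            break
        scales.append(pooled)
    return scales


if __name__ == "__main__":
    hourly = [float(v) for v in range(8)]
    for level, values in enumerate(build_scales(hourly)):
        print(level, values)
```

Each coarser view smooths out fine-grained variation and exposes slower patterns, which is the sense in which temporal patterns differ across sampling scales.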

Taxonomy

Topics: Time Series Analysis and Forecasting · Neural Networks and Applications

Methods: ADaptive gradient method with the OPTimal convergence rate