Frequency-Aligned Knowledge Distillation for Lightweight Spatiotemporal Forecasting

Yuqi Li; Chuanguang Yang; Hansheng Zeng; Zeyu Dong; Zhulin An; Yongjun Xu; Yingli Tian; Hao Wu

arXiv:2507.02939 · cs.LG · July 22, 2025

TL;DR

This paper introduces SDKD (Spectral Decoupled Knowledge Distillation), a frequency-aligned knowledge distillation method that transfers multi-scale spectral features from a complex teacher model to a lightweight student network for spatiotemporal forecasting, improving accuracy while reducing training time and memory consumption.

Contribution

The paper proposes a novel frequency-aligned knowledge distillation strategy that effectively transfers multi-scale spectral features from teacher to student models for spatiotemporal forecasting.

Findings

01. Achieves up to an 81.3% reduction in MSE on the Navier-Stokes dataset.

02. Significantly improves forecasting accuracy by capturing both high-frequency and low-frequency components.

03. Reduces training time and memory consumption compared to traditional methods.

Abstract

Spatiotemporal forecasting tasks, such as traffic flow, combustion dynamics, and weather forecasting, often require complex models that suffer from low training efficiency and high memory consumption. This paper proposes a lightweight framework, Spectral Decoupled Knowledge Distillation (SDKD), which transfers multi-scale spatiotemporal representations from a complex teacher model to a more efficient, lightweight student network. The teacher model follows an encoder, latent evolution, decoder architecture, in which the latent evolution module decouples high-frequency details from low-frequency trends using convolution and a Transformer (the global low-frequency modeler). However, the multi-layer convolution and deconvolution structures result in slow training and high memory usage. To address these issues, we propose a frequency-aligned knowledge distillation strategy, which extracts…
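To make the idea of frequency alignment concrete, the following is a minimal sketch, not the authors' implementation. It assumes that feature maps are 2D arrays, that low and high frequencies are separated by a radial FFT mask, and that the student is penalized separately on each band. The function names `split_frequencies` and `frequency_aligned_loss`, the `cutoff` value, and the weights `w_low` and `w_high` are all hypothetical and do not come from the paper.

```python
import numpy as np


def split_frequencies(field, cutoff=0.25):
    """Split a 2D field into low- and high-frequency parts.

    Assumption: a hard radial mask in the FFT domain. `cutoff` is in
    cycles per sample (Nyquist is 0.5) and is a hypothetical parameter.
    """
    h, w = field.shape
    fy = np.fft.fftfreq(h)[:, None]   # vertical frequencies, shape (h, 1)
    fx = np.fft.fftfreq(w)[None, :]   # horizontal frequencies, shape (1, w)
    low_mask = np.sqrt(fx ** 2 + fy ** 2) <= cutoff
    spectrum = np.fft.fft2(field)
    # The mask is radially symmetric, so the inverse transform is real
    # up to floating-point rounding.
    low = np.fft.ifft2(spectrum * low_mask).real
    high = field - low                # residual holds the high frequencies
    return low, high


def frequency_aligned_loss(student_feat, teacher_feat, cutoff=0.25,
                           w_low=1.0, w_high=1.0):
    """Weighted MSE between student and teacher, computed per band.

    Assumption: an illustrative loss form, not the one used in the paper.
    """
    s_low, s_high = split_frequencies(student_feat, cutoff)
    t_low, t_high = split_frequencies(teacher_feat, cutoff)
    low_term = np.mean((s_low - t_low) ** 2)
    high_term = np.mean((s_high - t_high) ** 2)
    return w_low * low_term + w_high * high_term


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    teacher = rng.standard_normal((32, 32))
    student = teacher + 0.1 * rng.standard_normal((32, 32))
    print(frequency_aligned_loss(student, teacher))
```

Weighting the two bands separately is one way a student could be pushed to match fine detail and global trends independently; the actual SDKD objective may differ.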
