Deployment-Efficient Short-Term Load Forecasting in AI Data Centers via Sequence-to-Point Knowledge Distillation
Lei Wang, Jiahao Chen, Fanping Sui, Ying Zhang, and Di Shi

TL;DR
This paper introduces a knowledge distillation framework that first trains a high-capacity teacher model for short-term load forecasting in AI data centers, then transfers its knowledge to a lightweight student model for deployment, retaining high accuracy with significantly lower resource requirements.
Contribution
It presents a novel sequence-to-point distillation method that improves the accuracy of lightweight load-forecasting models for AI data centers, balancing deployment efficiency against predictive performance.
Findings
The student model outperforms recent deep learning baselines in forecasting accuracy.
The approach reduces model size and memory use by more than a factor of 10.
Case studies demonstrate improved deployment efficiency without sacrificing accuracy.
Abstract
Accurately forecasting the bursty and non-stationary power demand of AI data centers has become increasingly important, as abrupt workload-driven variations at the GPU-node level can affect real-time operational efficiency, power management, and grid-data center coordination. However, high-capacity forecasting models are often difficult to deploy at scale because of their memory and latency requirements, while lightweight predictors may fail to capture short-horizon temporal dynamics. To address this accuracy-deployment tradeoff, this paper proposes a deployment-efficient knowledge distillation framework for short-term load forecasting in AI data centers. The proposed framework first trains a high-capacity sequence teacher model for multi-step load trajectory prediction, where residual learning is used to improve robustness under non-stationary operating conditions. A lightweight…
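The abstract is cut off before it describes the student model, so the following is only a minimal sketch of what a sequence-to-point distillation objective could look like, not the authors' formulation. It assumes the teacher outputs a multi-step trajectory, the student outputs a single point, and the student is trained against both the ground truth and one selected step of the teacher's trajectory. The function names, the `alpha` weighting, the `step` index, and the additive residual form are all hypothetical choices made for illustration.

```python
import numpy as np

def residual_to_load(last_observed, predicted_residual):
    """Turn predicted residuals into absolute load values.

    Assumption: residual learning means the teacher predicts deviations
    from the last observed load; the source does not specify the form.
    last_observed: shape (batch,); predicted_residual: shape (batch, horizon).
    """
    last = np.asarray(last_observed, dtype=float)[:, None]
    return last + np.asarray(predicted_residual, dtype=float)

def seq_to_point_distill_loss(student_point, teacher_traj, target_point,
                              alpha=0.5, step=0):
    """Hypothetical sequence-to-point distillation loss.

    student_point: shape (batch,), the student's single-point forecast.
    teacher_traj:  shape (batch, horizon), the teacher's trajectory forecast.
    target_point:  shape (batch,), the ground-truth load at the target step.
    alpha weights the ground-truth term; (1 - alpha) weights the teacher term.
    """
    student_point = np.asarray(student_point, dtype=float)
    teacher_traj = np.asarray(teacher_traj, dtype=float)
    target_point = np.asarray(target_point, dtype=float)

    # Select the one trajectory step the student is asked to imitate.
    teacher_point = teacher_traj[:, step]

    hard = np.mean((student_point - target_point) ** 2)   # fit to data
    soft = np.mean((student_point - teacher_point) ** 2)  # fit to teacher
    return alpha * hard + (1.0 - alpha) * soft
```

With `alpha=1.0` the sketch reduces to ordinary supervised training of the student, and with `alpha=0.0` the student only imitates the teacher, so intermediate values trade off the two signals.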
