PRISM: Dynamic Primitive-Based Forecasting for Large-Scale GPU Cluster Workloads

Xin Wu; Fei Teng; Xingwang Li; Bin Zheng; Qiang Duan

arXiv:2603.25378·cs.DC·March 27, 2026

Open Access

TL;DR

PRISM is a novel forecasting framework that uses primitive-based decomposition and spectral refinement to accurately predict volatile, heterogeneous GPU workloads, improving resource management in AI infrastructure.

Contribution

It introduces a dual representation approach combining dictionary-driven decomposition with spectral refinement for workload forecasting.

Findings

1. Achieves state-of-the-art forecasting accuracy on large-scale GPU traces
2. Reduces burst-phase errors significantly
3. Provides a robust foundation for dynamic GPU resource management

Abstract

Accurately forecasting GPU workloads is essential for AI infrastructure, enabling efficient scheduling, resource allocation, and power management. Modern workloads are highly volatile, multi-periodic, and heterogeneous, making them challenging for traditional predictors. We propose PRISM, a primitive-based compositional forecasting framework that combines dictionary-driven temporal decomposition with adaptive spectral refinement. This dual representation extracts stable, interpretable workload signatures across diverse GPU jobs. Evaluated on large-scale production traces, PRISM achieves state-of-the-art results. It significantly reduces burst-phase errors, providing a robust, architecture-aware foundation for dynamic resource management in GPU-powered AI platforms.
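
The abstract does not specify how the dictionary or the spectral step is built, so the following is only a generic sketch of the two ideas it names, not PRISM's actual algorithm. It assumes a fixed, hand-picked dictionary (a constant plus sine/cosine pairs) fitted by least squares, followed by a top-k FFT filter on the residual. All function names, the choice of atoms, and the synthetic signal are hypothetical.

```python
import numpy as np

def build_dictionary(n, periods):
    """Hypothetical primitive dictionary: a constant atom plus a sin/cos pair per period."""
    t = np.arange(n)
    atoms = [np.ones(n)]
    for p in periods:
        atoms.append(np.sin(2 * np.pi * t / p))
        atoms.append(np.cos(2 * np.pi * t / p))
    return np.stack(atoms, axis=1)  # shape (n, 1 + 2 * len(periods))

def decompose(y, D):
    """Least-squares projection of y onto the dictionary; returns coefficients and residual."""
    coef, *_ = np.linalg.lstsq(D, y, rcond=None)
    return coef, y - D @ coef

def spectral_refine(residual, k):
    """Keep only the k largest-magnitude frequency bins of the residual."""
    spec = np.fft.rfft(residual)
    keep = np.argsort(np.abs(spec))[-k:]
    filtered = np.zeros_like(spec)
    filtered[keep] = spec[keep]
    return np.fft.irfft(filtered, n=len(residual))

# Synthetic utilisation-like signal (assumed for illustration only):
# a baseline, a period-24 cycle the dictionary knows, and a period-8 cycle it does not.
n = 240
t = np.arange(n)
y = 2.0 + 3.0 * np.sin(2 * np.pi * t / 24) + 1.5 * np.sin(2 * np.pi * t / 8)

D = build_dictionary(n, periods=[24])
coef, residual = decompose(y, D)          # dictionary captures baseline and period-24 part
refined = spectral_refine(residual, k=1)  # spectral step recovers the leftover period-8 part
reconstruction = D @ coef + refined
```

In this toy case the dictionary coefficients stay directly interpretable (baseline level and the amplitude of the known cycle), while the spectral step picks up periodic structure the dictionary misses. A forecasting variant would additionally need to extrapolate both parts beyond the observed window, which this sketch does not attempt.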

Taxonomy

Topics: Cloud Computing and Resource Management · Parallel Computing and Optimization Techniques · Big Data and Digital Economy