Optimizing Memory Allocation in Distributed Clusters with Predictive Modeling
Jonathan Bader, Edgar Blumenthal, Marten Eckardt, Justus Krebs, Joel Witzke, Xemena Wysokinska, Haci Ismail Aslan, Odej Kao

TL;DR
This paper presents a predictive memory allocation method that uses an ensemble of quantile regression models (LightGBM and XGBoost) with a multiplicative safety factor, reducing both underallocation and overallocation in distributed clusters.
Contribution
It introduces a novel ensemble-based predictive approach with safety factors to improve memory allocation efficiency in distributed clusters.
Findings
Reduced the share of underallocated jobs from 4.17% to 2.89% on a real-world dataset of build jobs provided by SAP.
Lowered average overallocation from 148% to 44.51% on the same dataset.
Explored the Pareto frontier between underallocation and overallocation.
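The trade-off named in the last finding can be illustrated with a small sketch. It is not the authors' evaluation code: the function names `evaluate` and `pareto_front`, the sample numbers, and the exact metric definitions (underallocation as the share of jobs allocated less than they used, overallocation as the mean relative surplus) are assumptions made for illustration only.

```python
def evaluate(used, predicted, factor):
    """Return (underallocation rate, mean relative overallocation) for one safety factor.

    Assumed metric definitions, not taken from the paper.
    """
    alloc = [factor * p for p in predicted]
    under = sum(a < u for a, u in zip(alloc, used)) / len(used)
    over = sum(max(a - u, 0.0) / u for a, u in zip(alloc, used)) / len(used)
    return under, over


def pareto_front(points):
    """Keep the points that no other point dominates (lower is better on both axes)."""
    front = []
    for i, (u_i, o_i) in enumerate(points):
        dominated = any(
            u_j <= u_i and o_j <= o_i and (u_j < u_i or o_j < o_i)
            for j, (u_j, o_j) in enumerate(points)
            if j != i
        )
        if not dominated:
            front.append((u_i, o_i))
    return front


# Hypothetical peak memory usage and model predictions (GB).
used = [4.0, 8.0, 2.0, 16.0]
predicted = [4.5, 7.0, 2.2, 12.0]

# Sweep a few hypothetical safety factors and keep the non-dominated trade-offs.
results = [evaluate(used, predicted, f) for f in (1.0, 1.2, 1.4)]
for under, over in pareto_front(results):
    print(f"underallocated: {under:.2f}, mean overallocation: {over:.2f}")
```

A larger safety factor moves along the frontier toward fewer underallocated jobs at the cost of more wasted memory, which is why no single setting is best on both metrics.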
Abstract
In modern distributed systems, efficient resource allocation is vital for maintaining scalability, reducing operational costs, and ensuring fast execution even across heterogeneous workloads. Predictive models for resource usage are essential tools for optimizing allocation and preventing system bottlenecks. A key challenge in predictive memory allocation is its asymmetric cost: underallocation causes failures, while overallocation wastes memory. We propose a regression method based on a LightGBM and XGBoost ensemble trained to predict high conditional quantiles. To further account for the high cost of underallocations, we add a multiplicative safety factor. With our method, we reduce the share of underallocated jobs from 4.17% to 2.89% and the average overallocation from 148% to 44.51% on a real-world dataset of build jobs provided by SAP. We further explore the Pareto frontier…
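The asymmetry described in the abstract can be made concrete with the quantile (pinball) loss, which is the standard objective for predicting a conditional quantile. The sketch below is a minimal illustration in plain Python, not the paper's implementation: the helper names `pinball_loss` and `allocate`, the quantile level 0.9, the safety factor 1.1, the simple averaging of the two models, and all sample values are assumptions, and the two prediction lists merely stand in for LightGBM and XGBoost outputs.

```python
def pinball_loss(y_true, y_pred, q):
    """Average quantile (pinball) loss at quantile level q in (0, 1)."""
    total = 0.0
    for y, p in zip(y_true, y_pred):
        diff = y - p
        # Under-prediction (diff > 0) is weighted by q, over-prediction by 1 - q.
        total += q * diff if diff >= 0 else (q - 1) * diff
    return total / len(y_true)


def allocate(pred_a, pred_b, safety_factor):
    """Average two model predictions, then scale by a multiplicative safety factor.

    Simple averaging is an assumption; the abstract does not say how the
    ensemble members are combined.
    """
    return [safety_factor * (a + b) / 2 for a, b in zip(pred_a, pred_b)]


# At a high quantile, predicting too little costs far more than predicting too much.
print(pinball_loss([10.0], [8.0], q=0.9))   # under-prediction by 2 GB
print(pinball_loss([10.0], [12.0], q=0.9))  # over-prediction by 2 GB

# Hypothetical peak memory usage (GB) and hypothetical quantile predictions.
used = [4.0, 8.0, 2.0, 16.0]
pred_model_a = [4.5, 7.0, 2.5, 15.0]
pred_model_b = [5.5, 8.0, 2.5, 17.0]

for factor in (1.0, 1.1):
    alloc = allocate(pred_model_a, pred_model_b, factor)
    under_rate = sum(a < u for a, u in zip(alloc, used)) / len(used)
    print(f"safety factor {factor}: underallocation rate {under_rate:.2f}")
```

In this toy data, one job is underallocated without the safety factor and none with it, at the price of every job receiving 10% more memory than the ensemble predicted.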
