Solving Large-Scale Granular Resource Allocation Problems Efficiently with POP
Deepak Narayanan, Fiodar Kazhamiaka, Firas Abuzaid, Peter Kraft, Akshay Agrawal, Srikanth Kandula, Stephen Boyd, Matei Zaharia

TL;DR
This paper introduces POP (Partitioned Optimization Problems), an approach to large-scale granular resource allocation that randomly partitions a problem into smaller sub-problems and solves each one separately, producing near-optimal allocations with much lower runtime.
Contribution
POP reuses the original optimization formulation on randomly chosen partitions of the problem, which yields better allocations than domain-specific heuristics on large-scale systems.
Findings
POP achieves allocations within 1.5% of optimal
POP significantly reduces runtime compared to existing methods
Effective for cluster scheduling, traffic engineering, and load balancing
Abstract
Resource allocation problems in many computer systems can be formulated as mathematical optimization problems. However, finding exact solutions to these problems using off-the-shelf solvers is often intractable for large problem sizes with tight SLAs, leading system designers to rely on cheap, heuristic algorithms. We observe, however, that many allocation problems are granular: they consist of a large number of clients and resources, each client requests a small fraction of the total number of resources, and clients can interchangeably use different resources. For these problems, we propose an alternative approach that reuses the original optimization problem formulation and leads to better allocations than domain-specific heuristics. Our technique, Partitioned Optimization Problems (POP), randomly splits the problem into smaller problems (with a subset of the clients and resources in…
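The partitioning idea in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the function names (`solve_max_min`, `pop_allocate`), the toy max-min fairness objective, and the treatment of each partition's resources as one pooled capacity are all assumptions made for illustration. A hand-written water-filling routine stands in for the off-the-shelf solver, and the per-partition results are simply merged into one allocation, which is also an assumption.

```python
import random


def solve_max_min(demands, capacity):
    """Stand-in sub-problem solver (assumption): water-filling max-min
    fair split of one pooled capacity among clients with demand caps."""
    alloc = {c: 0.0 for c in demands}
    remaining = capacity
    # Visit clients from smallest to largest demand.
    active = sorted(demands, key=lambda c: demands[c])
    while active and remaining > 1e-12:
        share = remaining / len(active)
        smallest = active[0]
        if demands[smallest] <= share:
            # The smallest demand fits in an equal share: satisfy it fully.
            alloc[smallest] = demands[smallest]
            remaining -= demands[smallest]
            active.pop(0)
        else:
            # No remaining client can be fully satisfied: split equally.
            for c in active:
                alloc[c] = share
            remaining = 0.0
            active = []
    return alloc


def pop_allocate(demands, capacities, k, seed=0):
    """Hypothetical POP-style wrapper: randomly split clients and
    resources into k sub-problems, solve each with the same
    formulation, and merge the sub-allocations."""
    rng = random.Random(seed)
    clients = list(demands)
    resources = list(capacities)
    rng.shuffle(clients)
    rng.shuffle(resources)
    allocation = {}
    for i in range(k):
        # Every k-th shuffled client and resource goes to partition i.
        sub_demands = {c: demands[c] for c in clients[i::k]}
        sub_capacity = sum(capacities[r] for r in resources[i::k])
        allocation.update(solve_max_min(sub_demands, sub_capacity))
    return allocation


if __name__ == "__main__":
    demands = {f"client{i}": 1 + (i % 4) for i in range(8)}
    capacities = {f"res{j}": 3 for j in range(4)}
    for client, amount in sorted(pop_allocate(demands, capacities, k=2).items()):
        print(client, amount)
```

Because each sub-problem sees only a fraction of the clients and resources, the sub-problems are independent and could be solved in parallel. The sketch does not attempt to measure how close the merged allocation is to the unpartitioned optimum.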
Taxonomy
Topics: Cloud Computing and Resource Management · Optimization and Search Problems · Distributed and Parallel Computing Systems
