Multi-Agent Reinforcement Learning for Adaptive Resource Orchestration in Cloud-Native Clusters

Guanzi Yao; Heyao Liu; Linyan Dai

arXiv:2508.10253·cs.LG·August 15, 2025

TL;DR

This paper introduces a multi-agent reinforcement learning approach for adaptive resource orchestration in cloud-native clusters, improving efficiency, stability, and fairness in complex, high-dimensional scheduling environments.

Contribution

It presents a heterogeneous, role-based agent modeling mechanism and a reward-shaping strategy that together improve coordination and convergence in multi-agent resource management systems.

Findings

01 Outperforms traditional methods in resource utilization and scheduling latency

02 Improves policy convergence speed and system stability

03 Remains effective in high-concurrency and complex-dependency scenarios

Abstract

This paper addresses the challenges of high resource dynamism and scheduling complexity in cloud-native database systems. It proposes an adaptive resource orchestration method based on multi-agent reinforcement learning. The method introduces a heterogeneous role-based agent modeling mechanism. This allows different resource entities, such as compute nodes, storage nodes, and schedulers, to adopt distinct policy representations. These agents are better able to reflect diverse functional responsibilities and local environmental characteristics within the system. A reward-shaping mechanism is designed to integrate local observations with global feedback. This helps mitigate policy learning bias caused by incomplete state observations. By combining real-time local performance signals with global system value estimation, the mechanism improves coordination among agents and enhances policy…
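
The abstract describes two ideas: agents with distinct roles, and a shaped reward that mixes local performance signals with a global value estimate. The sketch below shows one way these could fit together. It is a sketch under stated assumptions, not the authors' formulation. The summary does not give the actual reward formula, so the weighted sum used here is an assumption. The role names come from the abstract's examples, while `RoleConfig`, `ROLE_CONFIGS`, `shaped_reward`, and all weight values are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RoleConfig:
    """Hypothetical per-role settings; names and values are illustrative only."""
    local_weight: float   # weight on the agent's own real-time performance signal
    global_weight: float  # weight on the shared system-level value estimate


# Roles follow the abstract's examples; the weights are made-up placeholders.
ROLE_CONFIGS = {
    "compute": RoleConfig(local_weight=0.7, global_weight=0.3),
    "storage": RoleConfig(local_weight=0.6, global_weight=0.4),
    "scheduler": RoleConfig(local_weight=0.3, global_weight=0.7),
}


def shaped_reward(role: str, local_signal: float, global_value: float) -> float:
    """Blend a local signal with a global value estimate using role-specific weights.

    Assumption: a simple weighted sum. The paper's mechanism may differ.
    """
    cfg = ROLE_CONFIGS[role]
    return cfg.local_weight * local_signal + cfg.global_weight * global_value


if __name__ == "__main__":
    # Each role weighs the same pair of signals differently.
    for role in ROLE_CONFIGS:
        print(role, shaped_reward(role, local_signal=1.0, global_value=0.5))
```

Giving each role its own weights is one simple way to express heterogeneity. In this sketch a scheduler leans on the global estimate because its decisions affect the whole cluster, while a compute node leans on its local signal.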
