A Reinforcement Learning-Driven Task Scheduling Algorithm for Multi-Tenant Distributed Systems
Xiaopei Zhang, Xingang Wang, Xin Wang

TL;DR
This paper introduces a reinforcement learning-based task scheduling algorithm for multi-tenant distributed systems. The scheduler jointly optimizes task latency, resource utilization, and tenant fairness, and the authors report that it outperforms existing scheduling methods across the scenarios they tested.
Contribution
The paper presents an adaptive scheduling framework that models scheduling as a Markov decision process and learns the policy with Proximal Policy Optimization (PPO), enabling real-time, multi-objective optimization in complex distributed environments.
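For readers unfamiliar with PPO, the block below gives the standard clipped surrogate objective from the general PPO literature. It is included only as background: the summary does not state which PPO variant, clipping range, or advantage estimator the authors use, so the symbols here (the probability ratio r_t, the advantage estimate \hat{A}_t, and the clip range \epsilon) are assumptions, not the paper's notation.

```latex
% Standard PPO clipped surrogate objective (background, not taken from the paper).
% s_t: system state, a_t: scheduling action, \pi_\theta: policy network.
\[
  r_t(\theta) = \frac{\pi_\theta(a_t \mid s_t)}{\pi_{\theta_{\mathrm{old}}}(a_t \mid s_t)},
  \qquad
  L^{\mathrm{CLIP}}(\theta) =
  \mathbb{E}_t\!\left[
    \min\!\Big(
      r_t(\theta)\,\hat{A}_t,\;
      \operatorname{clip}\big(r_t(\theta),\, 1-\epsilon,\, 1+\epsilon\big)\,\hat{A}_t
    \Big)
  \right]
\]
```

In this formulation the value network typically supplies the baseline used to compute \hat{A}_t, which is one common way a policy network and a value network are coordinated during training.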
Findings
Outperforms existing scheduling approaches on multiple metrics
Demonstrates strong stability and generalization in diverse scenarios
Effectively balances latency, resource efficiency, and fairness
Abstract
This paper addresses key challenges in task scheduling for multi-tenant distributed systems, including dynamic resource variation, heterogeneous tenant demands, and fairness assurance. An adaptive scheduling method based on reinforcement learning is proposed. By modeling the scheduling process as a Markov decision process, the study defines the state space, action space, and reward function. A scheduling policy learning framework is designed using Proximal Policy Optimization (PPO) as the core algorithm. This enables dynamic perception of complex system states and real-time decision-making. Under a multi-objective reward mechanism, the scheduler jointly optimizes task latency, resource utilization, and tenant fairness. The coordination between the policy network and the value network continuously refines the scheduling strategy. This enhances overall system performance. To validate the…
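The multi-objective reward described in the abstract can be sketched as a weighted sum of a latency term, a utilization term, and a fairness term. This is a minimal illustration, not the authors' reward function: the weights, the `latency_budget` normalization, the `SchedulerObservation` type, and the use of Jain's fairness index as the fairness measure are all assumptions introduced here.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class SchedulerObservation:
    """Hypothetical per-step summary of the system state (not from the paper)."""
    mean_task_latency: float             # seconds, averaged over recently finished tasks
    latency_budget: float                # seconds, assumed normalization constant (> 0)
    utilization: float                   # fraction of cluster resources in use, in [0, 1]
    tenant_allocations: Sequence[float]  # resources currently granted to each tenant


def jain_fairness(allocations: Sequence[float]) -> float:
    """Jain's fairness index: 1.0 when all shares are equal, 1/n when one tenant has everything."""
    n = len(allocations)
    sum_sq = sum(x * x for x in allocations)
    if n == 0 or sum_sq == 0:
        return 1.0  # nothing allocated yet, so treat the state as trivially fair
    total = sum(allocations)
    return (total * total) / (n * sum_sq)


def multi_objective_reward(
    obs: SchedulerObservation,
    w_latency: float = 0.4,  # assumed weight
    w_util: float = 0.3,     # assumed weight
    w_fair: float = 0.3,     # assumed weight
) -> float:
    """Weighted sum: penalize latency, reward utilization and tenant fairness."""
    # Latency is a cost, so it enters negatively; cap at 1.0 to keep the term bounded.
    latency_term = -min(obs.mean_task_latency / obs.latency_budget, 1.0)
    fairness_term = jain_fairness(obs.tenant_allocations)
    return w_latency * latency_term + w_util * obs.utilization + w_fair * fairness_term


if __name__ == "__main__":
    obs = SchedulerObservation(
        mean_task_latency=0.5,
        latency_budget=1.0,
        utilization=0.8,
        tenant_allocations=[2.0, 2.0],
    )
    print(multi_objective_reward(obs))
```

A scalar reward of this shape is what a PPO agent would receive after each scheduling action. Changing the weights shifts the trade-off between latency, utilization, and fairness, which is the kind of balance the abstract describes.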
Taxonomy
Topics: Distributed and Parallel Computing Systems · Cloud Computing and Resource Management · Software-Defined Networks and 5G
