A centralized reinforcement learning method for multi-agent job scheduling in Grid
Milad Moradi

TL;DR
This paper introduces CLDS, a multi-agent reinforcement learning approach for scalable, adaptive job scheduling in Grid systems that balances load efficiently while limiting communication overhead.
Contribution
It proposes a novel multi-agent reinforcement learning framework for Grid job scheduling that is scalable, adaptive, and reduces communication costs compared to existing methods.
Findings
Effectively balances load in large-scale, heavily loaded Grids.
Maintains adaptive performance across different system scales.
Reduces communication overhead in multi-agent scheduling.
Abstract
One of the main challenges in Grid systems is designing an adaptive, scalable, and model-independent job scheduling method that achieves a desirable degree of load balancing and system efficiency. Centralized job scheduling methods have drawbacks such as a single point of failure and a lack of scalability, while decentralized methods require a coordination mechanism with limited communication. In this paper, we propose a multi-agent approach to job scheduling in Grid systems, named Centralized Learning Distributed Scheduling (CLDS), that uses the reinforcement learning framework. CLDS is a model-free approach that uses information about jobs and their completion times to estimate the efficiency of resources. In this method, a learner agent and several scheduler agents perform the tasks of learning and job scheduling using a coordination strategy that…
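To make the division of labor in the abstract concrete, the following is a minimal Python sketch of one learner agent and several scheduler agents. It is not the paper's algorithm: the update rule (a running average of job length divided by completion time), the epsilon-greedy choice, the periodic `sync_every` refresh, and all names (`Learner`, `Scheduler`, `simulate`, `alpha`, `epsilon`) are assumptions chosen for illustration, and the coordination strategy in the truncated abstract is not reproduced here.

```python
import random
from dataclasses import dataclass, field


@dataclass
class Learner:
    """Central agent holding one efficiency estimate per resource (assumed design)."""
    n_resources: int
    alpha: float = 0.2  # step size of the running average (assumed value)
    efficiency: list = field(default_factory=list)

    def __post_init__(self):
        # Start every resource at the same neutral estimate.
        self.efficiency = [1.0] * self.n_resources

    def update(self, resource, job_length, completion_time):
        # Model-free signal: work completed per unit time for this job.
        observed = job_length / completion_time
        old = self.efficiency[resource]
        self.efficiency[resource] = old + self.alpha * (observed - old)

    def snapshot(self):
        # Copy handed to schedulers so they can act without further messages.
        return list(self.efficiency)


@dataclass
class Scheduler:
    """Scheduler agent acting on a possibly stale copy of the estimates."""
    epsilon: float = 0.1  # exploration rate (assumed value)
    estimates: list = field(default_factory=list)

    def sync(self, learner):
        self.estimates = learner.snapshot()

    def choose(self, rng):
        # Epsilon-greedy: mostly pick the resource with the best estimate.
        if rng.random() < self.epsilon:
            return rng.randrange(len(self.estimates))
        return max(range(len(self.estimates)), key=self.estimates.__getitem__)


def simulate(n_jobs, speeds, n_schedulers=3, sync_every=10, seed=0):
    """Toy run: completion time is job_length / speed (hypothetical Grid model)."""
    rng = random.Random(seed)
    learner = Learner(n_resources=len(speeds))
    schedulers = [Scheduler() for _ in range(n_schedulers)]
    for s in schedulers:
        s.sync(learner)
    for job in range(n_jobs):
        scheduler = schedulers[job % n_schedulers]
        if job % sync_every == 0:
            scheduler.sync(learner)  # infrequent sync limits communication
        job_length = rng.uniform(1.0, 10.0)
        resource = scheduler.choose(rng)
        completion_time = job_length / speeds[resource]
        learner.update(resource, job_length, completion_time)
    return learner


if __name__ == "__main__":
    print(simulate(500, speeds=[1.0, 2.0, 4.0]).efficiency)
```

In this sketch only completion reports and occasional snapshots cross between agents, which is one simple way to keep learning centralized while leaving scheduling decisions local; a larger `sync_every` trades fresher estimates for fewer messages.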
