Decentralized Task Scheduling in Distributed Systems: A Deep Reinforcement Learning Approach
Daniel Benniah John

TL;DR
This paper introduces a decentralized deep reinforcement learning framework for task scheduling in heterogeneous distributed systems, achieving significant improvements in efficiency, energy use, and SLA satisfaction with a lightweight implementation suitable for edge devices.
Contribution
It presents a novel multi-agent DRL approach formulated as a Dec-POMDP, with a lightweight NumPy-based architecture for scalable, decentralized task scheduling.
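As a rough illustration of what a NumPy-only actor-critic agent acting on local observations might look like, here is a minimal sketch. It is not the paper's architecture, which this summary does not specify. The linear actor and critic, the one-step TD update, and every name below (`LocalActorCritic`, `obs_dim`, `n_actions`, `lr`, `gamma`) are assumptions made for illustration only.

```python
import numpy as np


class LocalActorCritic:
    """Hypothetical per-node agent: softmax-linear actor, linear critic.

    Each agent sees only its own observation vector, in the spirit of a
    Dec-POMDP. The linear function approximation is an assumption, not
    the paper's stated design.
    """

    def __init__(self, obs_dim, n_actions, lr=0.1, gamma=0.99, seed=0):
        self.rng = np.random.default_rng(seed)
        # Actor weights: one row of preferences per action.
        self.W = self.rng.normal(0.0, 0.1, size=(n_actions, obs_dim))
        # Critic weights: linear state-value estimate.
        self.v = np.zeros(obs_dim)
        self.lr = lr
        self.gamma = gamma

    def policy(self, obs):
        """Return action probabilities via a numerically stable softmax."""
        logits = self.W @ obs
        logits = logits - logits.max()
        p = np.exp(logits)
        return p / p.sum()

    def act(self, obs):
        """Sample an action (e.g. which queue or node receives the task)."""
        return int(self.rng.choice(len(self.W), p=self.policy(obs)))

    def update(self, obs, action, reward, next_obs, done):
        """One-step actor-critic update; returns the TD error."""
        bootstrap = 0.0 if done else self.gamma * (self.v @ next_obs)
        td_error = reward + bootstrap - self.v @ obs
        # Critic: semi-gradient TD(0).
        self.v += self.lr * td_error * obs
        # Actor: grad of log pi(a|o) for a softmax-linear policy is
        # (onehot(a) - pi(.|o)) outer obs.
        p = self.policy(obs)
        grad = -np.outer(p, obs)
        grad[action] += obs
        self.W += self.lr * td_error * grad
        return float(td_error)
```

In a decentralized setup, one such agent would run on each node and call `act` on its own observation, then `update` once the local reward is known. Because the agent holds only two small weight arrays, it needs no machine learning framework beyond NumPy.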
Findings
15.6% reduction in task completion time
15.2% energy efficiency improvement
82.3% SLA satisfaction rate
Abstract
Efficient task scheduling in large-scale distributed systems presents significant challenges due to dynamic workloads, heterogeneous resources, and competing quality-of-service requirements. Traditional centralized approaches face scalability limitations and single points of failure, while classical heuristics lack adaptability to changing conditions. This paper proposes a decentralized multi-agent deep reinforcement learning (DRL-MADRL) framework for task scheduling in heterogeneous distributed systems. We formulate the problem as a Decentralized Partially Observable Markov Decision Process (Dec-POMDP) and develop a lightweight actor-critic architecture implemented using only NumPy, enabling deployment on resource-constrained edge devices without heavyweight machine learning frameworks. Using workload characteristics derived from the publicly available Google Cluster Trace dataset, we…
Taxonomy
Topics: Distributed and Parallel Computing Systems · Cloud Computing and Resource Management · IoT and Edge/Fog Computing
