Multi-level Explanation of Deep Reinforcement Learning-based Scheduling

Shaojun Zhang; Chen Wang; Albert Zomaya

arXiv:2209.09645·cs.DC·September 21, 2022·1 cites

Open Access

TL;DR

This paper introduces a multi-level explanation framework for interpreting deep reinforcement learning-based scheduling policies, making complex decision processes understandable at job and task levels to improve trust and reveal robustness issues.

Contribution

It presents a novel multi-level explanation approach that dissects DRL scheduling decisions into interpretable models aligned with operational practices.

Findings

1. Provides insights into DRL scheduler decision-making
2. Reveals robustness issues in the scheduling policy
3. Enhances trust in complex scheduling systems

Abstract

Dependency-aware job scheduling in a cluster is NP-hard. Recent work shows that Deep Reinforcement Learning (DRL) is capable of solving it. However, although a DRL-based policy achieves remarkable performance gains, it is difficult for an administrator to understand. A complex model-based scheduler therefore struggles to gain trust in systems where simplicity is favored. In this paper, we propose a multi-level explanation framework to interpret the policy of DRL-based scheduling. We dissect its decision-making process into a job level and a task level, and approximate each level with interpretable models and rules that align with operational practices. We show that the framework gives the system administrator insights into the state-of-the-art scheduler and reveals a robustness issue with regard to its behavior pattern.
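To make the idea of approximating a policy with interpretable rules concrete, here is a minimal sketch of a generic surrogate-model approach. It is not the paper's method: the abstract does not specify which interpretable models are used. The black-box policy, the two job features (`remaining_work`, `num_ready_tasks`), and the single-threshold rule are all hypothetical stand-ins chosen for illustration.

```python
import random


def black_box_policy(job):
    """Hypothetical stand-in for a DRL scheduler's job-level decision.

    Returns 1 if the job would be picked next, 0 otherwise.
    A real policy would be a neural network; this is only a placeholder.
    """
    remaining_work, num_ready_tasks = job
    score = -0.8 * remaining_work + 0.1 * num_ready_tasks
    return 1 if score > -4.0 else 0


def fit_stump(samples, labels):
    """Search for the single-feature threshold rule that best imitates the labels.

    Returns (fidelity, feature_index, threshold, label_at_or_below_threshold),
    where fidelity is the fraction of samples on which the rule agrees
    with the black-box policy.
    """
    best = None
    for f in range(len(samples[0])):
        for t in sorted({s[f] for s in samples}):
            for below_label in (0, 1):
                preds = [below_label if s[f] <= t else 1 - below_label
                         for s in samples]
                agree = sum(p == y for p, y in zip(preds, labels))
                fidelity = agree / len(labels)
                if best is None or fidelity > best[0]:
                    best = (fidelity, f, t, below_label)
    return best


# Sample hypothetical job states and record what the black box does on each.
rng = random.Random(0)
jobs = [(rng.uniform(0, 10), rng.randint(1, 8)) for _ in range(200)]
actions = [black_box_policy(j) for j in jobs]

# Fit the interpretable surrogate and report it as a human-readable rule.
fidelity, feature, threshold, below_label = fit_stump(jobs, actions)
names = ["remaining_work", "num_ready_tasks"]
print(f"if {names[feature]} <= {threshold:.2f}: action={below_label} "
      f"else: action={1 - below_label}  (fidelity {fidelity:.2f})")
```

The fidelity score indicates how faithfully the simple rule reproduces the black-box decisions on the sampled states. States where the two disagree are natural places to look for the kind of unexpected behavior an administrator would want to inspect.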

Taxonomy

Topics: Reinforcement Learning in Robotics · Advanced Bandit Algorithms Research · Scheduling and Optimization Algorithms

Methods: ALIGN