TL;DR
ProRL is an interpretable reinforcement learning framework that learns human-readable scheduling programs, combining a domain-specific language, local search, and Bayesian optimization to outperform existing heuristics and DRL methods.
Contribution
ProRL introduces a programmatic RL approach that pairs a domain-specific language for scheduling (DSL-S) with local search and Bayesian optimization to produce interpretable and efficient scheduling policies.
Findings
ProRL achieves strong performance on benchmark scheduling problems.
ProRL performs well with limited training data (only 100 episodes).
ProRL outperforms existing heuristics and DRL baselines.
Abstract
Deep reinforcement learning (DRL) has recently emerged as a promising approach to solve combinatorial optimization problems such as job shop scheduling. However, the policies learned by DRL are typically represented by deep neural networks (DNNs), whose opaque neural architectures and non-interpretable policy decisions can lead to critical trust and usability concerns for human decision makers. In addition, the computational requirements of DNNs can further hinder practical deployment in resource-constrained environments. In this work, we propose ProRL, a novel interpretable programmatic reinforcement learning framework that achieves high-performance scheduling with human-readable and editable programmatic policies (i.e., programs). We first introduce a domain-specific language for scheduling (DSL-S) to represent scheduling strategies as structured programs. ProRL then explores the…
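To make the idea of a scheduling strategy represented as a structured program more concrete, here is a minimal Python sketch. It is not DSL-S: the grammar of DSL-S is not given in this summary, so the node types (`Feature`, `Const`, `BinOp`), the job features (`processing_time`, `remaining_work`), and the `select_job` helper are all hypothetical names chosen for illustration. The sketch only shows how a priority rule can be stored as a small expression tree that a person can read, print, and edit.

```python
from dataclasses import dataclass
from typing import Dict, List, Union

# Hypothetical job record: feature name -> numeric value.
Job = Dict[str, float]


@dataclass(frozen=True)
class Feature:
    """Leaf node: reads one named feature of a job."""
    name: str

    def evaluate(self, job: Job) -> float:
        return job[self.name]

    def __str__(self) -> str:
        return self.name


@dataclass(frozen=True)
class Const:
    """Leaf node: a numeric constant (the kind of value an optimizer could tune)."""
    value: float

    def evaluate(self, job: Job) -> float:
        return self.value

    def __str__(self) -> str:
        return str(self.value)


@dataclass(frozen=True)
class BinOp:
    """Inner node: combines two sub-expressions with +, -, or *."""
    op: str
    left: "Expr"
    right: "Expr"

    def evaluate(self, job: Job) -> float:
        a, b = self.left.evaluate(job), self.right.evaluate(job)
        if self.op == "+":
            return a + b
        if self.op == "-":
            return a - b
        if self.op == "*":
            return a * b
        raise ValueError(f"unsupported operator: {self.op}")

    def __str__(self) -> str:
        return f"({self.left} {self.op} {self.right})"


Expr = Union[Feature, Const, BinOp]


def select_job(program: Expr, queue: List[Job]) -> Job:
    """Dispatch the job with the lowest priority score under the given program."""
    return min(queue, key=program.evaluate)


# An illustrative rule: processing_time + 0.5 * remaining_work.
program = BinOp("+", Feature("processing_time"),
                BinOp("*", Const(0.5), Feature("remaining_work")))

queue = [
    {"id": 0, "processing_time": 4, "remaining_work": 10},
    {"id": 1, "processing_time": 6, "remaining_work": 2},
    {"id": 2, "processing_time": 3, "remaining_work": 14},
]

print(program)                      # the rule is readable as plain text
print(select_job(program, queue))   # the job chosen by the rule
```

Because the policy is an explicit tree, a practitioner could inspect the printed rule, swap a feature, or adjust a constant by hand, which is the kind of editability the abstract attributes to programmatic policies. How ProRL actually searches over programs and tunes their parameters is not covered by this sketch.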
