Beyond Training: Optimizing Reinforcement Learning Based Job Shop Scheduling Through Adaptive Action Sampling

Constantin Waubert de Puiseau; Christian Dörpelkus; Jannik Peters; Hasan Tercan; Tobias Meisen

arXiv:2406.07325·cs.AI·June 12, 2024

Open Access

TL;DR

This paper introduces a method called δ-sampling to adaptively balance exploration and exploitation in trained reinforcement learning agents for job shop scheduling, improving solution quality within computational constraints.

Contribution

The paper proposes a novel inference technique for deep reinforcement learning (DRL) agents that adjusts their behavior to the available computational budget, enhancing solution diversity and quality.

Findings

01 δ-sampling improves search space coverage

02 Optimal parameterization enhances solution quality

03 Method outperforms standard inference approaches

Abstract

Learned construction heuristics for scheduling problems have become increasingly competitive with established solvers and heuristics in recent years. In particular, significant improvements have been observed in solution approaches using deep reinforcement learning (DRL). While much attention has been paid to the design of network architectures and training algorithms to achieve state-of-the-art results, little research has investigated the optimal use of trained DRL agents during inference. Our work is based on the hypothesis that, similar to search algorithms, the utilization of trained DRL agents should be dependent on the acceptable computational budget. We propose a simple yet effective parameterization, called δ-sampling, that manipulates the trained action vector to bias agent behavior towards exploration or exploitation during solution construction. By following this…
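The abstract is cut off before it defines δ-sampling, so the following is only a rough sketch under an assumed form, not the authors' method. It assumes that the "trained action vector" is a probability distribution over actions and that δ acts as an exponent applied to each probability before renormalizing. The function names `delta_reshape` and `sample_action` and the example probabilities are hypothetical.

```python
import random


def delta_reshape(probs, delta):
    """Reshape an action distribution with an exponent delta (assumed form).

    ASSUMPTION: p_i ** delta followed by renormalization is a hypothetical
    stand-in, not the paper's definition of delta-sampling.
    """
    if delta < 0:
        raise ValueError("delta must be non-negative")
    # Keep masked (zero-probability) actions at zero, even when delta == 0.
    powered = [p ** delta if p > 0 else 0.0 for p in probs]
    total = sum(powered)
    if total == 0:
        raise ValueError("at least one action needs positive probability")
    return [w / total for w in powered]


def sample_action(probs, delta, rng):
    """Draw one action index from the reshaped distribution."""
    weights = delta_reshape(probs, delta)
    return rng.choices(range(len(weights)), weights=weights, k=1)[0]


if __name__ == "__main__":
    policy_output = [0.7, 0.2, 0.1]  # hypothetical agent output for one step
    rng = random.Random(0)
    for delta in (0.0, 0.5, 1.0, 3.0):
        reshaped = delta_reshape(policy_output, delta)
        action = sample_action(policy_output, delta, rng)
        print(delta, [round(p, 3) for p in reshaped], action)
```

Under this assumed form, δ = 1 leaves the agent's distribution unchanged, δ < 1 flattens it (more exploration, useful when many schedules can be sampled), and δ > 1 sharpens it towards the most probable action (more exploitation, useful under a tight budget).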

Taxonomy

Topics: Scheduling and Optimization Algorithms · Smart Grid Energy Management