A Reinforcement Learning Approach for Scheduling Problems With Improved Generalization Through Order Swapping
Deepak Vivekanandan, Samuel Wirth, Patrick Karlbauer, Noah Klarmann

TL;DR
This paper presents a novel reinforcement learning method using PPO and order swapping mechanisms to improve generalization and solution quality for the Job Shop Scheduling Problem, outperforming traditional heuristics and metaheuristics.
Contribution
The work introduces an innovative DRL-based approach with order swapping to enhance generalization and efficiency in solving JSSP, a complex NP-hard problem.
Findings
DRL approach with PPO outperforms traditional heuristics.
Order swapping improves generalization in scheduling.
Method achieves competitive results on benchmark instances.
Abstract
The scheduling of production resources (such as assigning jobs to machines) plays a vital role in the manufacturing industry, not only for saving energy but also for increasing overall efficiency. Among the different job scheduling problems, this work addresses the job shop scheduling problem (JSSP). The JSSP is an NP-hard combinatorial optimization problem (COP), for which exhaustive search becomes infeasible. Simple heuristics such as first in, first out (FIFO) and largest processing time first (LPT), and metaheuristics such as tabu search, are often adopted to solve the problem by truncating the search space. These methods become impractical for large problem sizes, because their solutions are either far from the optimum or too time-consuming to compute. In recent years, research on using deep reinforcement learning (DRL) to solve COPs has gained interest and has shown promising results in terms of solution quality and computational efficiency. In this work, we provide a novel approach to…
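To make the baseline heuristics named in the abstract concrete, the following is a minimal sketch of a greedy dispatcher for a JSSP instance under FIFO-style and LPT-style priority rules. It is not the paper's method or code: the function names (`dispatch`, `fifo`, `lpt`), the tie-breaking by job index, the exact reading of each rule, and the small three-job instance are all assumptions made for illustration.

```python
from typing import Callable, List, Tuple

Operation = Tuple[int, int]   # (machine index, processing time)
Job = List[Operation]         # operations must run in the listed order
# A priority rule maps (job index, next operation, time the job became ready)
# to a sortable key; the job with the smallest key is dispatched next.
Priority = Callable[[int, Operation, int], Tuple[int, int]]


def dispatch(jobs: List[Job], priority: Priority) -> int:
    """Greedily schedule one operation at a time and return the makespan."""
    n_machines = 1 + max(machine for job in jobs for machine, _ in job)
    next_op = [0] * len(jobs)          # index of each job's next operation
    job_ready = [0] * len(jobs)        # time each job's previous operation ends
    machine_ready = [0] * n_machines   # time each machine becomes free
    remaining = sum(len(job) for job in jobs)

    while remaining:
        candidates = [j for j in range(len(jobs)) if next_op[j] < len(jobs[j])]
        j = min(candidates,
                key=lambda c: priority(c, jobs[c][next_op[c]], job_ready[c]))
        machine, duration = jobs[j][next_op[j]]
        # An operation starts once both its job and its machine are free.
        start = max(job_ready[j], machine_ready[machine])
        job_ready[j] = machine_ready[machine] = start + duration
        next_op[j] += 1
        remaining -= 1

    return max(machine_ready)


def fifo(job: int, op: Operation, ready: int) -> Tuple[int, int]:
    # Assumed FIFO reading: the job that has been waiting longest goes first.
    return (ready, job)


def lpt(job: int, op: Operation, ready: int) -> Tuple[int, int]:
    # Assumed LPT reading: the longest next operation goes first.
    return (-op[1], job)


# Hypothetical toy instance: 3 jobs on 3 machines.
jobs: List[Job] = [
    [(0, 3), (1, 2), (2, 2)],
    [(0, 2), (2, 1), (1, 4)],
    [(1, 4), (2, 3)],
]

fifo_makespan = dispatch(jobs, fifo)
lpt_makespan = dispatch(jobs, lpt)
print("FIFO makespan:", fifo_makespan)
print("LPT makespan:", lpt_makespan)
```

Both rules build a feasible schedule in a single pass, which is why they are fast, but neither looks ahead, so the resulting makespan can sit well above the optimum. That gap is the motivation the abstract gives for learned policies.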
Taxonomy
Topics: Scheduling and Optimization Algorithms · Reinforcement Learning in Robotics · Metaheuristic Optimization Algorithms Research
Methods: Entropy Regularization · Proximal Policy Optimization
