Learning to generalize Dispatching rules on the Job Shop Scheduling
Zangir Iklassov, Dmitrii Medvedev, Ruben Solozabal, Martin Takac

TL;DR
This paper proposes a reinforcement learning method with adversarial curriculum learning to improve the generalization of dispatching heuristics in job shop scheduling, significantly reducing optimality gaps across benchmark instances.
Contribution
It introduces a novel adversarial curriculum learning strategy and a size-agnostic deep learning model for better generalization in job shop scheduling.
Findings
Reduces average optimality gap from 19.35% to 10.46% on Taillard's instances.
Reduces average optimality gap from 38.43% to 18.85% on Demirkol's instances.
Significantly outperforms current state-of-the-art models.
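The percentages above are optimality gaps. As a minimal sketch, assuming the usual definition (relative excess of the obtained makespan over the best-known or optimal makespan for the same instance), the metric can be computed as follows. The function names are illustrative and do not come from the paper.

```python
def optimality_gap(makespan: float, best_known: float) -> float:
    """Percentage by which a makespan exceeds the best-known (or optimal) makespan."""
    if best_known <= 0:
        raise ValueError("best_known must be positive")
    return 100.0 * (makespan - best_known) / best_known


def average_gap(results):
    """Mean optimality gap over an iterable of (makespan, best_known) pairs."""
    gaps = [optimality_gap(m, b) for m, b in results]
    if not gaps:
        raise ValueError("results must not be empty")
    return sum(gaps) / len(gaps)
```

Under this definition, a schedule with makespan 110 on an instance whose best-known makespan is 100 has a gap of 10%, and a benchmark average is the mean of the per-instance gaps.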
Abstract
This paper introduces a Reinforcement Learning approach to better generalize heuristic dispatching rules on the Job-shop Scheduling Problem (JSP). Current models on the JSP do not focus on generalization, although, as we show in this work, this is key to learning better heuristics on the problem. A well-known technique to improve generalization is to learn on increasingly complex instances using Curriculum Learning (CL). However, as many works in the literature indicate, this technique might suffer from catastrophic forgetting when transferring the learned skills between different problem sizes. To address this issue, we introduce a novel Adversarial Curriculum Learning (ACL) strategy, which dynamically adjusts the difficulty level during the learning process to revisit the worst-performing instances. This work also presents a deep learning model to solve the JSP, which is equivariant…
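The abstract describes ACL only at a high level: difficulty is adjusted during training so that the worst-performing instances are revisited. The sketch below illustrates that general idea and is not the authors' algorithm. The class name, the softmax weighting, the exponential smoothing, and the use of (jobs, machines) tuples as difficulty levels are all assumptions made for illustration.

```python
import math
import random


class WorstFirstCurriculum:
    """Hypothetical sampler that favours the difficulty levels with the largest gaps."""

    def __init__(self, levels, temperature=1.0, smoothing=0.9, seed=0):
        self.levels = list(levels)
        self.temperature = temperature
        self.smoothing = smoothing
        self.rng = random.Random(seed)
        # Running estimate of the optimality gap per level; None until first observed.
        self.gap = {level: None for level in self.levels}

    def update(self, level, observed_gap):
        """Blend a newly observed gap into the running estimate for this level."""
        prev = self.gap[level]
        if prev is None:
            self.gap[level] = observed_gap
        else:
            self.gap[level] = self.smoothing * prev + (1 - self.smoothing) * observed_gap

    def probabilities(self):
        """Softmax over running gaps: worse levels receive higher probability."""
        known = [g for g in self.gap.values() if g is not None]
        # Unseen levels inherit the worst known gap so they are tried early.
        default = max(known) if known else 0.0
        scores = [
            (self.gap[lvl] if self.gap[lvl] is not None else default) / self.temperature
            for lvl in self.levels
        ]
        top = max(scores)  # subtract the maximum for numerical stability
        weights = [math.exp(s - top) for s in scores]
        total = sum(weights)
        return [w / total for w in weights]

    def sample(self):
        """Draw the next difficulty level to train on."""
        return self.rng.choices(self.levels, weights=self.probabilities(), k=1)[0]
```

In a training loop, one would call `sample()` to pick the next problem size, train on instances of that size, evaluate the resulting gap, and pass it to `update()`. Because every level keeps a nonzero probability, smaller sizes continue to be revisited, which is one plausible way to counter the forgetting the abstract mentions.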
Taxonomy
Topics: Scheduling and Optimization Algorithms · Scheduling and Timetabling Solutions · Metaheuristic Optimization Algorithms Research
