Bilevel Learning Model Towards Industrial Scheduling
Longkang Li, Hui-Ling Zhen, Mingxuan Yuan, Jiawen Lu, Xialiang Tong, Jia Zeng, Jun Wang, Dirk Schnieders

TL;DR
This paper introduces a novel bilevel deep reinforcement learning scheduler, BDS, for large-scale industrial job scheduling, significantly improving efficiency and solution quality over traditional heuristics and deep networks.
Contribution
The paper develops a bilevel deep reinforcement learning approach combining DDQN and GPN, with theoretical convergence guarantees, for large-scale industrial scheduling.
Findings
BDS outperforms heuristics and deep networks in industrial scenarios.
BDS reduces makespan by 22% to 28% on large datasets.
Computational time is less than 200 seconds for large-scale problems.
Abstract
Automatic industrial scheduling, aiming at optimizing the sequence of jobs over limited resources, is widely needed in manufacturing industries. However, existing scheduling systems heavily rely on heuristic algorithms, which either generate ineffective solutions or compute inefficiently when job scale increases. Thus, it is of great importance to develop new large-scale algorithms that are not only efficient and effective, but also capable of satisfying complex constraints in practice. In this paper, we propose a Bilevel Deep reinforcement learning Scheduler, BDS, in which the higher level is responsible for exploring an initial global sequence, whereas the lower level is aiming at exploitation for partial sequence refinements, and the two levels are connected by a sliding-window sampling mechanism. In the implementation, a Double Deep Q Network (DDQN) is used in the upper…
Taxonomy
Topics: Scheduling and Optimization Algorithms · Assembly Line Balancing Optimization · Advanced Manufacturing and Logistics Optimization
Methods: Softmax · Sigmoid Activation · Long Short-Term Memory · Tanh Activation · Pointer Network
