Dynamic resource matching in manufacturing using deep reinforcement learning
Saunak Kumar Panda, Yisha Xiang, Ruiqi Liu

TL;DR
This paper introduces a deep reinforcement learning approach with domain knowledge penalties for dynamic resource matching in manufacturing, improving policy effectiveness and convergence.
Contribution
It develops novel domain knowledge-informed Q-learning and DDPG algorithms for large-scale manufacturing resource-matching problems.
Findings
DKDDPG, the domain knowledge-informed DDPG algorithm, outperforms traditional DDPG and other RL algorithms.
The approach demonstrates higher rewards and greater efficiency in experiments.
Theoretical convergence guarantees are established for small-size problems.
Abstract
Matching plays an important role in the logical allocation of resources across a wide range of industries. The benefits of matching have been increasingly recognized in manufacturing industries. In particular, capacity sharing has received much attention recently. In this paper, we consider the problem of dynamically matching demand-capacity types of manufacturing resources. We formulate the multi-period, many-to-many manufacturing resource-matching problem as a sequential decision process. The formulated manufacturing resource-matching problem involves large state and action spaces, and it is not practical to accurately model the joint distribution of various types of demands. To address the curse of dimensionality and the difficulty of explicitly modeling the transition dynamics, we use a model-free deep reinforcement learning approach to find optimal matching policies. Moreover, to…
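To make the idea of a penalty-informed, model-free learner concrete, here is a minimal sketch. It is not the authors' algorithm: it uses plain tabular Q-learning (not DDPG) on a hypothetical single-resource toy problem. The state layout, the reward of 1 per matched unit, the infeasibility penalty, and every name and parameter value are assumptions made for illustration. The only "domain knowledge" it encodes is that a matching cannot exceed the available capacity or the outstanding demand.

```python
import random


def penalized_q_learning(episodes=5000, horizon=10, max_units=3,
                         alpha=0.1, gamma=0.5, epsilon=0.2,
                         penalty=5.0, seed=0):
    """Tabular Q-learning on a toy matching problem (hypothetical setup).

    State:  (capacity, demand), each an integer in 0..max_units.
    Action: number of units to match this period, in 0..max_units.
    """
    rng = random.Random(seed)
    actions = list(range(max_units + 1))
    q = {}  # maps (state, action) -> estimated action value

    def step(state, action):
        capacity, demand = state
        feasible = min(capacity, demand)
        matched = min(action, feasible)
        # Assumed reward: 1 per matched unit, minus an assumed penalty for
        # every unit requested beyond what capacity and demand allow.
        reward = matched - penalty * max(0, action - feasible)
        # Next period's capacity and demand are sampled at random; the
        # learner never sees this distribution, only the samples.
        next_state = (rng.randint(0, max_units), rng.randint(0, max_units))
        return reward, next_state

    for _ in range(episodes):
        state = (rng.randint(0, max_units), rng.randint(0, max_units))
        for _ in range(horizon):
            # Epsilon-greedy exploration.
            if rng.random() < epsilon:
                action = rng.choice(actions)
            else:
                action = max(actions, key=lambda a: q.get((state, a), 0.0))
            reward, next_state = step(state, action)
            # Standard Q-learning temporal-difference update.
            best_next = max(q.get((next_state, a), 0.0) for a in actions)
            old = q.get((state, action), 0.0)
            q[(state, action)] = old + alpha * (reward + gamma * best_next - old)
            state = next_state
    return q


def greedy_action(q, state, max_units=3):
    """Return the action with the highest learned value in `state`."""
    return max(range(max_units + 1), key=lambda a: q.get((state, a), 0.0))


if __name__ == "__main__":
    q_table = penalized_q_learning()
    # Show what the learned policy does with capacity 3 and demand 2.
    print(greedy_action(q_table, (3, 2)))
```

In this toy setting the penalty steers the learned policy away from infeasible matchings, so it should settle on matching as many units as capacity and demand allow. The problems described in the abstract have far larger state and action spaces, which is why the paper turns to deep function approximation rather than a lookup table.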
