TL;DR
This paper introduces a model-free reinforcement learning algorithm for optimal control of episodic manufacturing processes, eliminating the need for explicit process modeling and effectively handling stochasticity and partial observability.
Contribution
It presents a novel Q-learning-based method for adaptive optimal control that learns online without pre-existing process models, suitable for complex manufacturing processes.
Findings
Successfully applied to a simulated deep drawing process
Handles stochastic variations and partial observability
Outperforms traditional model-based control methods
Abstract
A self-learning optimal control algorithm for episodic fixed-horizon manufacturing processes with time-discrete control actions is proposed and evaluated on a simulated deep drawing process. The control model is built during consecutive process executions under optimal control via reinforcement learning, using the measured product quality as reward after each process execution. Prior model formulation, which is required by state-of-the-art algorithms from model predictive control and approximate dynamic programming, is therefore obsolete. This avoids several difficulties, namely in system identification, accurate modelling, and runtime complexity, that arise when dealing with processes subject to nonlinear dynamics and stochastic influences. Instead of using pre-created process and observation models, value function-based reinforcement learning algorithms build functions of expected…
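To make the idea of learning a value function from the end-of-episode quality reward concrete, here is a minimal sketch of plain tabular Q-learning on a fixed-horizon episodic task. It is not the paper's algorithm: the toy process, the quality measure, and all names and constants (`HORIZON`, `ACTIONS`, `TARGET`, `toy_process_step`, `quality`, `train`, `greedy_action`) are hypothetical stand-ins, and the state is assumed fully observable, so partial observability is not covered.

```python
import random

HORIZON = 3            # control steps per process execution (assumed)
ACTIONS = (-1, 0, 1)   # hypothetical discrete control actions
TARGET = 2             # hypothetical final state with the best quality


def toy_process_step(state, action, rng):
    """Hypothetical stochastic process: the action shifts the state, plus rare noise."""
    noise = rng.choice((-1, 0, 0, 0, 0, 0, 0, 0, 1))
    return state + action + noise


def quality(final_state):
    """Hypothetical product quality, measured only after the episode ends."""
    return -abs(final_state - TARGET)


def train(episodes=5000, alpha=0.1, epsilon=0.2, seed=0):
    """Learn Q(t, state, action) online from repeated process executions."""
    rng = random.Random(seed)
    q = {}  # (time step, state, action) -> estimate of expected final quality

    def q_value(t, s, a):
        return q.get((t, s, a), 0.0)

    for _ in range(episodes):
        state = 0
        for t in range(HORIZON):
            # Epsilon-greedy: mostly exploit the current estimate, sometimes explore.
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)
            else:
                action = max(ACTIONS, key=lambda a: q_value(t, state, a))
            next_state = toy_process_step(state, action, rng)
            if t == HORIZON - 1:
                # The only reward is the quality measured at the end of the episode.
                target = quality(next_state)
            else:
                # No intermediate reward: bootstrap from the next step's best estimate.
                target = max(q_value(t + 1, next_state, a) for a in ACTIONS)
            old = q_value(t, state, action)
            q[(t, state, action)] = old + alpha * (target - old)
            state = next_state
    return q


def greedy_action(q, t, state):
    """Control action with the highest learned value at time step t."""
    return max(ACTIONS, key=lambda a: q.get((t, state, a), 0.0))
```

The table is indexed by time step because, in a fixed-horizon process, the best action generally depends on how many control steps remain. No process or observation model appears anywhere in `train`; it only needs to execute the process and read the final quality.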
