Green Simulation Assisted Reinforcement Learning with Model Risk for Biomanufacturing Learning and Control
Hua Zheng, Wei Xie, Mingbin Ben Feng

TL;DR
This paper introduces a green simulation-assisted reinforcement learning framework for biomanufacturing that effectively manages model risk and enhances data efficiency in process control.
Contribution
It proposes a novel model-based reinforcement learning method that quantifies process model risk and reuses simulation data to improve policy learning in biomanufacturing.
Findings
Demonstrates improved policy learning efficiency
Effectively manages model risk in stochastic systems
Shows promising results in numerical studies
Abstract
Biopharmaceutical manufacturing faces critical challenges, including complexity, high variability, lengthy lead times, and limited historical data and knowledge of the underlying stochastic process. To address these challenges, we propose a green simulation assisted model-based reinforcement learning framework to support online process learning and guide dynamic decision making. The process model risk is quantified by the posterior distribution. For any given policy, we predict the expected system response, with prediction risk accounting for both inherent stochastic uncertainty and model risk. Then, we propose green simulation assisted reinforcement learning and derive the mixture proposal distribution of the decision process and a likelihood ratio based metamodel for the policy gradient, which can selectively reuse process trajectory outputs collected from previous experiments to…
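The idea of reusing earlier experiment outputs through a mixture proposal and likelihood ratios can be illustrated with a deliberately small sketch. It is not the paper's estimator: it assumes a one-step decision with a Gaussian policy and a made-up reward, ignores model risk, and all names (`gaussian_pdf`, `reward`, `collect`, `mixture_lr_gradient`) are hypothetical. Actions sampled under past policy parameters are reweighted by the ratio of the target policy density to the mixture of the past policy densities, so no new simulation runs are needed to estimate the gradient at a new parameter.

```python
import math
import random


def gaussian_pdf(x, mean, std):
    """Density of a normal distribution N(mean, std^2) at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def reward(action):
    # Hypothetical reward for illustration only: best action is 2.0.
    return -(action - 2.0) ** 2


def collect(theta, std, n, rng):
    """Simulate n actions under the Gaussian policy N(theta, std^2)."""
    return [rng.gauss(theta, std) for _ in range(n)]


def mixture_lr_gradient(theta, past_thetas, past_actions, std):
    """Estimate dJ/dtheta at `theta` using only previously collected actions.

    past_actions[k] holds the actions drawn under past_thetas[k].
    Each sample is weighted by target density / mixture proposal density,
    where the mixture weights are the shares of samples per past policy.
    """
    total = sum(len(batch) for batch in past_actions)
    shares = [len(batch) / total for batch in past_actions]
    grad = 0.0
    for batch in past_actions:
        for a in batch:
            mixture = sum(
                share * gaussian_pdf(a, th, std)
                for share, th in zip(shares, past_thetas)
            )
            weight = gaussian_pdf(a, theta, std) / mixture
            score = (a - theta) / std ** 2  # d/dtheta of log N(a; theta, std^2)
            grad += weight * reward(a) * score
    return grad / total


if __name__ == "__main__":
    rng = random.Random(0)
    past_thetas = [-0.5, 0.0, 0.5]
    past_actions = [collect(th, 1.0, 5000, rng) for th in past_thetas]
    # For this toy reward the exact gradient is -2 * (theta - 2).
    print(mixture_lr_gradient(0.2, past_thetas, past_actions, 1.0))
```

Dividing by the mixture density rather than by each sample's own generating density keeps the weights bounded whenever the target policy is close to at least one past policy, which is the usual motivation for a mixture proposal.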
Taxonomy
Topics: Viral Infectious Diseases and Gene Expression in Insects · Advanced Multi-Objective Optimization Algorithms · Reinforcement Learning in Robotics
Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings
