Chance Constrained Policy Optimization for Process Control and Optimization
Panagiotis Petsagkourakis, Ilya Orson Sandoval, Eric Bradford, Federico Galvanin, Dongda Zhang, Ehecatl Antonio del Rio-Chanona

TL;DR
This paper introduces a chance constrained policy optimization (CCPO) method that ensures safety-critical constraints are satisfied with high probability in process control, addressing uncertainties and plant-model mismatch without complex inner optimization loops.
Contribution
The paper presents a novel CCPO algorithm with self-tuned constraint backoffs, enabling reliable satisfaction of joint chance constraints in reinforcement learning for industrial processes.
Findings
CCPO guarantees high-probability constraint satisfaction.
Backoffs are optimized via Bayesian methods.
Method is validated through case studies.
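To make the idea of a constraint backoff concrete, here is a minimal sketch, not the paper's algorithm. It assumes a hypothetical scalar process x_{t+1} = x_t + u_t + w_t with Gaussian noise, a hypothetical proportional policy that steers toward a tightened target x_max - backoff, and a joint constraint x_t <= x_max over the whole horizon. The paper tunes backoffs with Bayesian optimization; this sketch swaps in plain bisection on a Monte Carlo estimate to stay dependency-free. All names and parameter values (joint_satisfaction, tune_backoff, the gain 0.5, the noise level) are illustrative assumptions.

```python
import random


def joint_satisfaction(backoff, n_episodes=2000, horizon=20,
                       x_max=1.0, noise_std=0.05, seed=0):
    """Monte Carlo estimate of P(x_t <= x_max for all t in the horizon).

    Toy dynamics and policy are assumptions made for illustration only.
    """
    # Fixed seed gives common random numbers, so estimates for different
    # backoff values are compared on identical noise realizations.
    rng = random.Random(seed)
    target = x_max - backoff  # tightened set point
    satisfied_count = 0
    for _ in range(n_episodes):
        x = 0.0
        satisfied = True
        for _ in range(horizon):
            u = 0.5 * (target - x)  # hypothetical proportional policy
            x = x + u + rng.gauss(0.0, noise_std)
            if x > x_max:
                # Keep simulating so the noise stream stays aligned
                # across episodes and across backoff values.
                satisfied = False
        satisfied_count += satisfied
    return satisfied_count / n_episodes


def tune_backoff(alpha=0.05, lo=0.0, hi=1.0, iters=20):
    """Bisection for a small backoff with estimated satisfaction >= 1 - alpha.

    Stand-in for the Bayesian optimization step described in the paper.
    Assumes lo is infeasible and hi is feasible for this toy problem.
    """
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if joint_satisfaction(mid) >= 1.0 - alpha:
            hi = mid  # feasible: try a smaller, less conservative backoff
        else:
            lo = mid  # infeasible: need more tightening
    return hi


if __name__ == "__main__":
    b = tune_backoff()
    print("backoff:", b, "estimated joint satisfaction:", joint_satisfaction(b))
```

The trade-off this illustrates is that a larger backoff raises the probability of satisfying the joint constraint but moves the operating point further from the constraint boundary, so the tuning step looks for the smallest backoff that still meets the target probability 1 - alpha on the sampled trajectories.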
Abstract
Chemical process optimization and control are affected by 1) plant-model mismatch, 2) process disturbances, and 3) constraints for safe operation. Reinforcement learning by policy optimization would be a natural way to solve this due to its ability to address stochasticity, plant-model mismatch, and directly account for the effect of future uncertainty and its feedback in a proper closed-loop manner; all without the need of an inner optimization loop. One of the main reasons why reinforcement learning has not been considered for industrial processes (or almost any engineering application) is that it lacks a framework to deal with safety critical constraints. Present algorithms for policy optimization use difficult-to-tune penalty parameters, fail to reliably satisfy state constraints or present guarantees only in expectation. We propose a chance constrained policy optimization (CCPO)…
