Chance Constrained Policy Optimization for Process Control and Optimization

Panagiotis Petsagkourakis; Ilya Orson Sandoval; Eric Bradford; Federico Galvanin; Dongda Zhang; Ehecatl Antonio del Rio-Chanona

arXiv:2008.00030·eess.SY·December 18, 2020

TL;DR

This paper introduces a chance constrained policy optimization (CCPO) method that ensures safety-critical constraints are satisfied with high probability in process control, addressing uncertainties and plant-model mismatch without complex inner optimization loops.
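For orientation, a joint chance constraint and its backoff-tightened surrogate are commonly written as below. The notation (states x_t, constraint functions g_j, backoffs b_j, violation level alpha, horizon T) is generic and assumed here for illustration; it is not taken from the paper.

```latex
% Joint chance constraint: all constraints hold at all time steps
% with probability at least 1 - \alpha.
\mathbb{P}\Big( g_j(x_t) \le 0 \;\; \forall j \in \{1,\dots,n_g\},\ \forall t \in \{0,\dots,T\} \Big) \ge 1 - \alpha

% Tightened surrogate: each nominal constraint is shifted inward by a backoff b_j.
g_j(x_t) + b_j \le 0, \qquad b_j \ge 0
```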

Contribution

The paper presents a novel CCPO algorithm with self-tuned constraint backoffs, enabling reliable satisfaction of joint chance constraints in reinforcement learning for industrial processes.
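The idea of a constraint backoff can be illustrated with a small Monte Carlo sketch. This is not the paper's algorithm: the scalar system, the hand-written policy, and the bisection search are stand-ins chosen for brevity, and every name here (`joint_satisfaction`, `tune_backoff`, `x_max`, `alpha`) is hypothetical. The sketch tightens a state bound by a backoff, estimates how often the original bound holds over a whole episode, and searches for a small backoff that reaches the target probability.

```python
import numpy as np


def joint_satisfaction(backoff, noise, x_max=1.0, a=0.9, x0=0.0):
    """Fraction of episodes in which x_t <= x_max holds at every step."""
    n_episodes, horizon = noise.shape
    x = np.full(n_episodes, x0)
    ok = np.ones(n_episodes, dtype=bool)
    target = x_max - backoff  # tightened bound the policy aims for
    for t in range(horizon):
        u = target - a * x           # toy policy (assumption): steer to the tightened bound
        x = a * x + u + noise[:, t]  # toy dynamics (assumption) with additive disturbance
        ok &= x <= x_max             # joint constraint: must hold at every step
    return ok.mean()


def tune_backoff(noise, alpha=0.05, lo=0.0, hi=1.0, iters=30):
    """Bisection for a small backoff with estimated satisfaction >= 1 - alpha.

    Assumes `hi` already satisfies the target. Reusing the same noise samples
    for every candidate keeps the estimate monotone in the backoff.
    """
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if joint_satisfaction(mid, noise) >= 1.0 - alpha:
            hi = mid
        else:
            lo = mid
    return hi


rng = np.random.default_rng(0)
noise = rng.normal(0.0, 0.1, size=(5000, 10))  # 5000 episodes, 10 steps each
b = tune_backoff(noise)
print(f"backoff = {b:.3f}")
print(f"satisfaction without backoff = {joint_satisfaction(0.0, noise):.3f}")
print(f"satisfaction with backoff    = {joint_satisfaction(b, noise):.3f}")
```

Without a backoff the policy sits on the bound and the joint constraint is violated in almost every episode; a larger backoff trades performance for safety. The Findings state that the paper tunes backoffs with Bayesian methods, which this bisection does not attempt to reproduce.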

Findings

01. CCPO guarantees high-probability constraint satisfaction.

02. Backoffs are optimized via Bayesian methods.

03. The method is validated through case studies.

Abstract

Chemical process optimization and control are affected by 1) plant-model mismatch, 2) process disturbances, and 3) constraints for safe operation. Reinforcement learning by policy optimization would be a natural way to solve this due to its ability to address stochasticity and plant-model mismatch, and to directly account for the effect of future uncertainty and its feedback in a proper closed-loop manner, all without the need for an inner optimization loop. One of the main reasons why reinforcement learning has not been considered for industrial processes (or almost any engineering application) is that it lacks a framework to deal with safety-critical constraints. Present algorithms for policy optimization use difficult-to-tune penalty parameters, fail to reliably satisfy state constraints, or present guarantees only in expectation. We propose a chance constrained policy optimization (CCPO)…
