Intersection of Reinforcement Learning and Bayesian Optimization for Intelligent Control of Industrial Processes: A Safe MPC-based DPG using Multi-Objective BO

Hossein Nejatbakhsh Esfahani; Javad Mohammadpour Velni

arXiv:2507.09864 · eess.SY · July 15, 2025

Open Access

TL;DR

This paper introduces a safe and efficient control framework combining MPC-RL with Multi-Objective Bayesian Optimization, improving convergence, safety, and performance in industrial process control.

Contribution

It presents a novel MPC-RL-MOBO framework that integrates Bayesian optimization with RL and MPC for safer, faster, and more effective control policy tuning.

Findings

01 Enhanced sample efficiency in control learning

02 Improved safety during online adaptation

03 Achieved stable high-performance control

Abstract

Model Predictive Control (MPC)-based Reinforcement Learning (RL) offers a structured and interpretable alternative to Deep Neural Network (DNN)-based RL methods, with lower computational complexity and greater transparency. However, standard MPC-RL approaches often suffer from slow convergence, suboptimal policy learning due to limited parameterization, and safety issues during online adaptation. To address these challenges, we propose a novel framework that integrates MPC-RL with Multi-Objective Bayesian Optimization (MOBO). The proposed MPC-RL-MOBO utilizes noisy evaluations of the RL stage cost and its gradient, estimated via a Compatible Deterministic Policy Gradient (CDPG) approach, and incorporates them into a MOBO algorithm using the Expected Hypervolume Improvement (EHVI) acquisition function. This fusion enables efficient and safe tuning of the MPC parameters to achieve…
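To make the EHVI acquisition function mentioned in the abstract more concrete, here is a minimal Python sketch of a Monte Carlo estimate of Expected Hypervolume Improvement for two objectives under minimization. It is not the authors' implementation. The function names (`pareto_front`, `hypervolume_2d`, `ehvi_mc`), the reference point, the example data, and the assumption of independent Gaussian posteriors per objective are all illustrative choices that do not come from the paper.

```python
import numpy as np


def pareto_front(points):
    """Return the non-dominated rows of `points` (minimization in every objective)."""
    pts = np.asarray(points, dtype=float)
    keep = []
    for i, p in enumerate(pts):
        # p is dominated if some row is <= p everywhere and < p somewhere.
        dominated = np.any(np.all(pts <= p, axis=1) & np.any(pts < p, axis=1))
        if not dominated:
            keep.append(i)
    return pts[keep]


def hypervolume_2d(points, ref):
    """Area dominated by `points` and bounded above by `ref` (two objectives)."""
    ref = np.asarray(ref, dtype=float)
    pts = np.asarray(points, dtype=float).reshape(-1, 2)
    pts = pts[np.all(pts < ref, axis=1)]  # points beyond ref contribute nothing
    if len(pts) == 0:
        return 0.0
    front = pareto_front(pts)
    front = front[np.argsort(front[:, 0])]  # ascending f1 implies descending f2
    hv, prev_f2 = 0.0, ref[1]
    for f1, f2 in front:
        hv += (ref[0] - f1) * (prev_f2 - f2)  # horizontal strip added by this point
        prev_f2 = f2
    return hv


def ehvi_mc(mean, std, observed, ref, n_samples=2000, seed=0):
    """Monte Carlo EHVI for one candidate.

    Assumes (hypothetically) an independent Gaussian posterior per objective,
    with posterior mean `mean` and standard deviation `std` at the candidate.
    """
    rng = np.random.default_rng(seed)
    base = hypervolume_2d(observed, ref)
    samples = rng.normal(mean, std, size=(n_samples, 2))
    gains = [hypervolume_2d(np.vstack([observed, s]), ref) - base for s in samples]
    return float(np.mean(gains))


# Illustrative data: three already-evaluated parameter settings, two objectives.
observed = [(1.0, 3.0), (2.0, 2.0), (3.0, 1.0)]
ref = (4.0, 4.0)

promising = ehvi_mc(mean=(1.5, 1.5), std=0.2, observed=observed, ref=ref)
dominated = ehvi_mc(mean=(3.5, 3.5), std=0.2, observed=observed, ref=ref)
```

In a tuning loop of this kind, the candidate with the largest estimated EHVI would be evaluated next. Here `promising` should exceed `dominated`, because the first candidate is expected to extend the Pareto front while the second is expected to lie behind it.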

Taxonomy

Topics: Advanced Control Systems Optimization · Fault Detection and Control Systems