Convergent NMPC-based Reinforcement Learning Using Deep Expected Sarsa and Nonlinear Temporal Difference Learning

Amine Salaje; Thomas Chevet; Nicolas Langlois

arXiv:2502.04925 · eess.SY · April 23, 2025

Open Access

TL;DR

This paper presents a learning-based nonlinear model predictive controller (NMPC) whose weights are tuned by reinforcement learning, using a deep Expected Sarsa in which a neural network replaces the secondary NMPC, stabilizing learning and roughly halving the real-time computational burden.

Contribution

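It proposes two methods for learning the NMPC weights with RL: a deep Expected Sarsa whose subsequent action-value function is approximated by a neural network, and a combination of gradient temporal difference methods with a parametrized NMPC.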

Findings

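01. Approach converges to a local optimum without instability.

02. Using the neural network approximately halves the real-time computational burden without affecting closed-loop performance.

03. Feeding the current values of the NMPC's learned parameters to the neural network stabilizes the learning performance.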

Abstract

In this paper, we present a learning-based nonlinear model predictive controller (NMPC) using an original reinforcement learning (RL) method to learn the optimal weights of the NMPC scheme, for which two methods are proposed. Firstly, the controller is used as the current action-value function of a deep Expected Sarsa where the subsequent action-value function, usually obtained with a secondary NMPC, is approximated with a neural network (NN). With respect to existing methods, we add to the NN's input the current value of the NMPC's learned parameters so that the network is able to approximate the action-value function and stabilize the learning performance. Additionally, with the use of the NN, the real-time computational burden is approximately halved without affecting the closed-loop performance. Secondly, we combine gradient temporal difference methods with a parametrized NMPC as a…
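As a rough illustration of the Expected Sarsa update mentioned in the abstract, the sketch below computes a temporal difference error in which the next action value is averaged over the policy's action probabilities. It is a generic textbook form, not the authors' implementation: the function name, the finite set of candidate next actions, and the use of a reward signal are assumptions made for this example. In the setting the abstract describes, `q_current` would come from the parametrized NMPC and `next_q_values` from the neural network evaluated with the current NMPC parameters as an additional input; here both are plain numbers.

```python
import numpy as np


def expected_sarsa_td_error(q_current, reward, gamma, next_q_values, next_action_probs):
    """TD error for Expected Sarsa over a finite set of candidate next actions.

    All names are hypothetical and chosen for this sketch only.
    q_current:         action value Q(s, a) of the visited state-action pair
    reward:            observed one-step reward (a stage cost could be used the same way)
    gamma:             discount factor in [0, 1]
    next_q_values:     Q(s', a') for each candidate next action a'
    next_action_probs: policy probabilities pi(a' | s'), summing to 1
    """
    next_q_values = np.asarray(next_q_values, dtype=float)
    next_action_probs = np.asarray(next_action_probs, dtype=float)
    if next_q_values.shape != next_action_probs.shape:
        raise ValueError("next_q_values and next_action_probs must have the same shape")
    if not np.isclose(next_action_probs.sum(), 1.0):
        raise ValueError("next_action_probs must sum to 1")

    # Expected Sarsa averages the next action value under the policy
    # instead of using a single sampled next action (as plain Sarsa does).
    expected_next_q = float(np.dot(next_action_probs, next_q_values))
    target = reward + gamma * expected_next_q
    return target - q_current


if __name__ == "__main__":
    delta = expected_sarsa_td_error(
        q_current=1.0,
        reward=0.5,
        gamma=0.9,
        next_q_values=[1.0, 3.0],
        next_action_probs=[0.5, 0.5],
    )
    print(delta)
```

A learning step would then move the learned parameters along the gradient of `q_current` scaled by this error; how that gradient is obtained from the NMPC is outside the scope of this sketch.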

Taxonomy

Topics: Neural Networks and Applications · Machine Learning and ELM

Methods: Expected Sarsa · Sarsa