MPC-based Reinforcement Learning for Economic Problems with Application to Battery Storage

Arash Bahari Kordabad; Wenqi Cai; Sebastien Gros

arXiv:2104.02411·cs.LG·April 7, 2021

TL;DR

This paper introduces a homotopy strategy for MPC-based reinforcement learning on economic control problems. On a battery storage problem whose policy is nearly bang-bang, it learns faster and more uniformly than a classical policy gradient approach.

Contribution

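The paper proposes a homotopy strategy, based on the interior-point method, that relaxes the MPC policy during learning so that the deterministic policy gradient method can still produce meaningful parameter steps when the policy has a (nearly) bang-bang structure.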

Findings

01 Faster learning than a classical policy gradient approach.

02 Effective handling of (nearly) bang-bang policy structures.

03 Successful application to a well-known battery storage control problem.

Abstract

In this paper, we are interested in optimal control problems with purely economic costs, which often yield optimal policies having a (nearly) bang-bang structure. We focus on policy approximations based on Model Predictive Control (MPC) and on the use of the deterministic policy gradient method to optimize the MPC closed-loop performance in the presence of unmodelled stochasticity or model error. When the policy has a (nearly) bang-bang structure, we observe that the policy gradient method can struggle to produce meaningful steps in the policy parameters. To tackle this issue, we propose a homotopy strategy based on the interior-point method, which provides a relaxation of the policy during learning. We investigate a specific, well-known battery storage problem and show that the proposed method delivers more homogeneous and faster learning than a classical policy gradient approach.
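To illustrate why a bang-bang policy can stall a policy gradient method, and how an interior-point relaxation may help, here is a minimal sketch on a toy scalar problem. It is not the paper's MPC scheme or battery model. It assumes a hypothetical one-step problem, minimize theta*u subject to -1 <= u <= 1, whose exact solution u = -sign(theta) has zero sensitivity to theta almost everywhere. Replacing the bounds with a log-barrier weighted by tau gives a smooth policy with a closed form. The function names and the barrier values are illustrative assumptions.

```python
import math


def relaxed_policy(theta: float, tau: float) -> float:
    """Minimizer of theta*u - tau*(log(1 - u) + log(1 + u)) over -1 < u < 1.

    Toy assumption: a scalar input with box bounds and a linear (economic) cost.
    Stationarity gives theta*u**2 - 2*tau*u - theta = 0; the root inside
    (-1, 1) is written here in a form that is also well defined at theta = 0.
    Requires tau > 0.
    """
    s = math.hypot(theta, tau)  # sqrt(theta**2 + tau**2)
    return -theta / (tau + s)


def policy_sensitivity(theta: float, tau: float) -> float:
    """Analytic derivative d(relaxed_policy)/d(theta) for tau > 0."""
    s = math.hypot(theta, tau)
    return -tau / (s * (tau + s))


if __name__ == "__main__":
    theta = 1.0
    # Homotopy idea: start with a large barrier weight (smooth policy,
    # informative gradient) and shrink it toward the bang-bang limit.
    for tau in (1.0, 1e-1, 1e-2, 1e-4):
        u = relaxed_policy(theta, tau)
        du = policy_sensitivity(theta, tau)
        print(f"tau={tau:g}  u={u:.6f}  du/dtheta={du:.6e}")
```

In this toy setting, a large tau keeps the policy away from the bounds and its sensitivity to theta clearly non-zero. As tau shrinks, the policy approaches -sign(theta) and the sensitivity vanishes, which mirrors the difficulty the abstract describes for (nearly) bang-bang policies.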
