A Learning-based Optimal Market Bidding Strategy for Price-Maker Energy Storage
Mathilde D. Badoual, Scott J. Moura

TL;DR
This paper introduces an online Supervised Actor-Critic reinforcement learning algorithm for energy storage bidding in electricity markets. The agent learns to maximize profit while accounting for its own impact on market clearing prices, and it outperforms the Model Predictive Control (MPC) controller it is compared against.
Contribution
The paper presents a novel online Actor-Critic algorithm supervised by a Model Predictive Control (MPC) controller, improving bidding strategies for energy storage in electricity markets.
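To make the idea of an actor supervised by a model-based controller concrete, here is a minimal sketch of one common formulation from the supervised actor-critic literature. It is an assumption, not the paper's confirmed update rule: the summary does not state how the actor and the MPC supervisor are combined. All names (composite_action, actor_update, the blending gain k, the power limit p_max, the linear actor) are hypothetical and chosen for illustration only.

```python
def composite_action(actor_action, supervisor_action, k, p_max):
    """Blend the actor's bid with the supervisor's (e.g. MPC) bid.

    k = 1 follows the supervisor only; k = 0 follows the actor only.
    The gain k and the symmetric limit p_max are illustrative assumptions.
    """
    if not 0.0 <= k <= 1.0:
        raise ValueError("k must lie in [0, 1]")
    blended = k * supervisor_action + (1.0 - k) * actor_action
    # Clip to the assumed charge/discharge power limit of the storage unit.
    return max(-p_max, min(p_max, blended))


def actor_update(theta, features, actor_action, supervisor_action,
                 explored_action, td_error, k, lr):
    """One update of a linear actor (action = theta . features).

    The step mixes two signals:
      * a reinforcement term, weighted by (1 - k), that moves the actor
        toward the explored action when the TD error is positive;
      * a supervised term, weighted by k, that moves the actor toward
        the supervisor's action.
    """
    rl_term = (1.0 - k) * td_error * (explored_action - actor_action)
    supervised_term = k * (supervisor_action - actor_action)
    step = rl_term + supervised_term
    # For a linear actor, the gradient of the action w.r.t. theta is the
    # feature vector itself.
    return [w + lr * step * f for w, f in zip(theta, features)]
```

With k close to 1 early in training, the bids stay close to the supervisor's, which is one way such a scheme can avoid unsafe exploration; lowering k over time would hand control to the learned actor. Whether and how the paper schedules such a gain is not stated in this summary.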
Findings
Supervised Actor-Critic outperforms MPC in profit generation.
The agent learns and adjusts to its own impact on market clearing prices during training.
Supervision by MPC offers a safer, more data-efficient alternative to the lengthy trial-and-error training of unsupervised RL methods.
Abstract
Load serving entities with storage units reach sizes and performances that can significantly impact clearing prices in electricity markets. Nevertheless, price endogeneity is rarely considered in storage bidding strategies, and modeling the electricity market is a challenging task. Meanwhile, model-free reinforcement learning algorithms such as Actor-Critic are becoming increasingly popular for designing energy system controllers. Yet implementation frequently requires lengthy, data-intense, and unsafe trial-and-error training. To fill these gaps, we implement an online Supervised Actor-Critic (SAC) algorithm, supervised with a model-based controller, Model Predictive Control (MPC). The energy storage agent is trained with this algorithm to optimally bid while learning and adjusting to its impact on the market clearing prices. We compare the supervised Actor-Critic algorithm with the MPC…
Taxonomy
Methods: Dilated Convolution · Convolution · Average Pooling · Global Average Pooling · 1x1 Convolution · Switchable Atrous Convolution
