Optimal Cycling of a Heterogenous Battery Bank via Reinforcement Learning
Vivek Deulkar, Jayakrishnan Nair

TL;DR
This paper develops a reinforcement learning approach to optimize the charging and discharging of a heterogeneous battery bank, aiming to minimize long-term degradation costs amid stochastic energy supply and demand.
Contribution
The paper proposes a Q-learning algorithm with linear function approximation, built on a specially designed class of kernel functions that approximate the structure of the value functions of the underlying Markov decision process (MDP).
Findings
Minimizing long-run battery degradation costs is posed formally as a Markov decision process.
The formulation accommodates batteries that differ in capacity, ramp constraints, losses, and cycling costs.
The proposed algorithm is validated via an extensive case study.
Abstract
We consider the problem of optimal charging/discharging of a bank of heterogeneous battery units, driven by stochastic electricity generation and demand processes. The batteries in the battery bank may differ with respect to their capacities, ramp constraints, losses, as well as cycling costs. The goal is to minimize the degradation costs associated with battery cycling in the long run; this is posed formally as a Markov decision process. We propose a linear function approximation based Q-learning algorithm for learning the optimal solution, using a specially designed class of kernel functions that approximate the structure of the value functions associated with the MDP. The proposed algorithm is validated via an extensive case study.
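To make the idea of Q-learning with linear function approximation concrete, the following is a minimal sketch on a toy two-battery problem. It is not the authors' algorithm: the capacities, costs, penalty, i.i.d. net-supply process, discounted-cost objective, and the simple block features in `features` are all assumptions chosen for illustration, and the features are placeholders for the paper's specially designed kernel functions, which the abstract does not specify.

```python
import random

CAPACITY = [4, 2]        # hypothetical capacities (energy units) of two batteries
CYCLE_COST = [1.0, 0.3]  # hypothetical per-unit cycling (degradation) costs
PENALTY = 5.0            # assumed cost when the chosen battery cannot serve the request
N_ACTIONS = 2            # action = index of the battery that absorbs/supplies the net energy
N_FEATURES = 4           # per-action block: bias, two states of charge, net supply


def features(state, action):
    """Block feature vector phi(s, a): state features placed in the block of `action`."""
    levels, net = state
    base = [1.0, levels[0] / CAPACITY[0], levels[1] / CAPACITY[1], float(net)]
    phi = [0.0] * (N_ACTIONS * N_FEATURES)
    start = action * N_FEATURES
    phi[start:start + N_FEATURES] = base
    return phi


def q_value(theta, state, action):
    """Linear approximation Q(s, a) = theta . phi(s, a)."""
    return sum(t * f for t, f in zip(theta, features(state, action)))


def step(state, action, rng):
    """Apply the net supply to the chosen battery; return (cost, next_state)."""
    levels, net = state
    levels = list(levels)
    new_level = levels[action] + net
    if 0 <= new_level <= CAPACITY[action]:
        levels[action] = new_level
        cost = CYCLE_COST[action] * abs(net)
    else:
        cost = PENALTY  # battery full or empty: request not served
    next_net = rng.choice([-1, 0, 1])  # assumed i.i.d. generation minus demand
    return cost, (tuple(levels), next_net)


def td_update(theta, state, action, cost, next_state, alpha, gamma):
    """One Q-learning step for cost minimization (min over next actions)."""
    target = cost + gamma * min(q_value(theta, next_state, a) for a in range(N_ACTIONS))
    delta = target - q_value(theta, state, action)
    phi = features(state, action)
    return [t + alpha * delta * f for t, f in zip(theta, phi)]


def train(n_steps=5000, alpha=0.05, gamma=0.9, epsilon=0.1, seed=0):
    """Epsilon-greedy Q-learning on the toy problem; returns the learned weights."""
    rng = random.Random(seed)
    theta = [0.0] * (N_ACTIONS * N_FEATURES)
    state = ((2, 1), 0)
    for _ in range(n_steps):
        if rng.random() < epsilon:
            action = rng.randrange(N_ACTIONS)
        else:
            action = min(range(N_ACTIONS), key=lambda a: q_value(theta, state, a))
        cost, next_state = step(state, action, rng)
        theta = td_update(theta, state, action, cost, next_state, alpha, gamma)
        state = next_state
    return theta
```

Calling `train()` returns a weight vector from which a greedy policy picks, in each state, the battery with the lowest approximate Q-value. Because costs are minimized rather than rewards maximized, the update uses a `min` over next actions. Convergence of Q-learning with linear function approximation is not guaranteed in general, which is one reason the choice of features matters.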
Taxonomy
Topics: Advanced Battery Technologies Research · Electric Vehicles and Infrastructure · Advancements in Battery Materials
Methods: Q-Learning
