Comparison and Unification of Three Regularization Methods in Batch Reinforcement Learning
Sarah Rathnam, Susan A. Murphy, and Finale Doshi-Velez

TL;DR
This paper unifies three regularization methods in batch reinforcement learning into a common framework, revealing how data distribution and MDP structure influence their effectiveness, supported by empirical evaluation across a range of MDPs and data collection policies.
Contribution
It introduces a unified framework for three regularization methods in batch RL, clarifying their differences and interactions within a common mathematical form.
Findings
Regularization methods can be compared within a common weighted average transition matrix framework.
The effectiveness of each regularization method depends on the MDP structure and data distribution.
Empirical results confirm the theoretical insights across multiple MDPs and data policies.
Abstract
In batch reinforcement learning, there can be poorly explored state-action pairs, resulting in poorly learned, inaccurate models and poorly performing associated policies. Various regularization methods can mitigate the problem of learning overly complex models in Markov decision processes (MDPs); however, they operate in technically and intuitively distinct ways and lack a common form in which to compare them. This paper unifies three regularization methods in a common framework: a weighted average transition matrix. Considering regularization methods in this common form illuminates how the MDP structure and the state-action pair distribution of the batch data set influence the relative performance of regularization methods. We confirm intuitions generated from the common framework by empirical evaluation across a range of MDPs and data collection policies.
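To make the "weighted average transition matrix" idea concrete, here is a minimal sketch in Python with NumPy. It is not the paper's formulation: the function names, the `strength` parameter, and the count-dependent weight are assumptions chosen for illustration. The weight used here is the one that arises from the posterior mean under a Dirichlet prior, where `reg_matrix` plays the role of the prior mean and `strength` the total pseudo-count. The weights the paper derives for each of its three methods are not stated in this summary.

```python
import numpy as np

def mle_transitions(counts):
    """Row-normalize transition counts of shape (S, A, S).

    State-action pairs with no data fall back to a uniform row so that
    every row is a valid probability distribution.
    """
    counts = np.asarray(counts, dtype=float)
    totals = counts.sum(axis=-1, keepdims=True)
    n_states = counts.shape[-1]
    uniform = np.full_like(counts, 1.0 / n_states)
    safe_totals = np.where(totals > 0, totals, 1.0)  # avoid division by zero
    return np.where(totals > 0, counts / safe_totals, uniform)

def weighted_average_transitions(counts, reg_matrix, strength):
    """Blend the empirical model with a regularizing matrix.

    Assumption (illustrative only): the weight on reg_matrix is
    strength / (strength + n(s, a)), so rarely visited state-action
    pairs are pulled more strongly toward reg_matrix.
    """
    counts = np.asarray(counts, dtype=float)
    totals = counts.sum(axis=-1, keepdims=True)
    weight = strength / (strength + totals)
    return (1.0 - weight) * mle_transitions(counts) + weight * reg_matrix

# Hypothetical two-state, one-action example.
counts = np.array([[[3, 1]],    # (s=0, a=0) observed 4 times
                   [[0, 0]]])   # (s=1, a=0) never observed
reg_matrix = np.full((2, 1, 2), 0.5)
blended = weighted_average_transitions(counts, reg_matrix, strength=4.0)
```

In this example the well-explored pair keeps half of its empirical estimate, giving the row [0.625, 0.375], while the unobserved pair is set entirely to the regularizing row [0.5, 0.5]. This illustrates why the state-action distribution of the batch data matters: the effective amount of regularization differs across pairs depending on how often each was visited.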
Taxonomy
Topics: Reinforcement Learning in Robotics · Machine Learning and Algorithms · Data Stream Mining Techniques
