Model-Free Robust $\phi$-Divergence Reinforcement Learning Using Both Offline and Online Data
Kishan Panaganti, Adam Wierman, Eric Mazumdar

TL;DR
This paper introduces model-free algorithms for robust reinforcement learning that leverage both offline and online data, providing theoretical guarantees for high-dimensional systems with function approximation.
Contribution
It proposes the first unified analysis for $\phi$-divergence-based robust policies and introduces a hybrid framework combining offline and online data with new theoretical guarantees.
Findings
First unified analysis for $\phi$-divergences in high-dimensional systems.
Introduction of a hybrid offline-online robust RL framework.
Theoretical guarantees on policy performance in large state spaces.
Abstract
The robust $\phi$-regularized Markov Decision Process (RRMDP) framework focuses on designing control policies that are robust against parameter uncertainties due to mismatches between the simulator (nominal) model and real-world settings. This work makes two important contributions. First, we propose a model-free algorithm called Robust $\phi$-regularized fitted Q-iteration (RPQ) for learning an $\epsilon$-optimal robust policy that uses only the historical data collected by rolling out a behavior policy (with robust exploratory requirement) on the nominal model. To the best of our knowledge, we provide the first unified analysis for a class of $\phi$-divergences achieving robust optimal policies in high-dimensional systems with general function approximation. Second, we introduce the hybrid robust $\phi$-regularized reinforcement learning framework to learn an optimal robust policy…
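To make the idea of a divergence-regularized robust backup concrete, here is a minimal sketch for one special case. It is not the paper's RPQ algorithm: it assumes the divergence is the KL divergence, a finite set of next states, and a regularization weight `lam`, and the function names are hypothetical. Under those assumptions the worst-case problem inf_P { E_P[V] + lam * KL(P || P0) } has the well-known closed form -lam * log E_{P0}[exp(-V / lam)], which needs only expectations under the nominal model P0.

```python
import math


def kl_regularized_backup(next_values, nominal_probs, lam):
    """Closed form of inf_P { E_P[V] + lam * KL(P || P0) }.

    Assumption: KL divergence and a finite next-state set. The names
    next_values, nominal_probs, and lam are illustrative only.
    """
    # Shift by the minimum value for numerical stability (log-sum-exp trick).
    v_min = min(next_values)
    total = sum(
        p * math.exp(-(v - v_min) / lam)
        for v, p in zip(next_values, nominal_probs)
    )
    return v_min - lam * math.log(total)


def worst_case_distribution(next_values, nominal_probs, lam):
    """Minimizing distribution: P*(s') proportional to P0(s') * exp(-V(s') / lam)."""
    v_min = min(next_values)
    weights = [
        p * math.exp(-(v - v_min) / lam)
        for v, p in zip(next_values, nominal_probs)
    ]
    z = sum(weights)
    return [w / z for w in weights]


# Example: three next states under a nominal transition distribution.
values = [1.0, 2.0, 4.0]
nominal = [0.5, 0.3, 0.2]
robust_value = kl_regularized_backup(values, nominal, lam=1.0)
adversarial = worst_case_distribution(values, nominal, lam=1.0)
```

The robust value always lies between the smallest next-state value and the nominal expectation. A large `lam` penalizes deviation from the nominal model heavily and recovers the ordinary expected backup, while a small `lam` approaches the worst-case value.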
Taxonomy
Topics: Reinforcement Learning in Robotics · Adaptive Dynamic Programming Control · Traffic control and management
