Hybrid Reinforcement Learning Breaks Sample Size Barriers in Linear MDPs

Kevin Tan; Wei Fan; Yuting Wei

arXiv:2408.04526 · stat.ML · August 9, 2024

Open Access

TL;DR

This paper introduces computationally efficient hybrid RL algorithms for linear MDPs that improve on the sample complexity bounds of purely offline and purely online RL, without relying on the single-policy concentrability assumption.

Contribution

It develops the first algorithms with sharp theoretical guarantees for hybrid RL in linear MDPs, improving sample complexity bounds without single-policy concentrability.

Findings

01. Achieves sharper error bounds in offline RL

02. Attains improved regret bounds in online RL

03. Establishes the tightest theoretical guarantees for hybrid RL in linear MDPs

Abstract

Hybrid Reinforcement Learning (RL), where an agent learns from both an offline dataset and online explorations in an unknown environment, has garnered significant recent interest. A crucial question posed by Xie et al. (2022) is whether hybrid RL can improve upon the existing lower bounds established in purely offline and purely online RL without relying on the single-policy concentrability assumption. While Li et al. (2023) provided an affirmative answer to this question in the tabular PAC RL case, the question remains unsettled for both the regret-minimizing RL case and the non-tabular case. In this work, building upon recent advancements in offline RL and reward-agnostic exploration, we develop computationally efficient algorithms for both PAC and regret-minimizing RL with linear function approximation, without single-policy concentrability. We demonstrate that these algorithms…

Taxonomy

Topics: Reinforcement Learning in Robotics