A General Framework for Sample-Efficient Function Approximation in Reinforcement Learning

Zixiang Chen; Chris Junchi Li; Angela Yuan; Quanquan Gu; Michael I. Jordan

arXiv:2209.15634·cs.LG·October 3, 2022·1 cites

TL;DR

This paper introduces a unified framework for efficient function approximation in reinforcement learning, combining model-based and model-free approaches, and proposes a new algorithm with improved sample complexity for various MDP models.

Contribution

It presents a general framework and a novel algorithm, OPERA, that achieves improved sample efficiency and regret bounds across diverse MDP settings.

Findings

1. OPERA achieves regret bounds that match or improve on the best-known results for a variety of MDP models.

2. For MDPs with low Witness rank, OPERA improves sample complexity by a factor of dH.

3. The framework unifies and extends the analysis of multiple RL models.

Abstract

With the increasing need for handling large state and action spaces, general function approximation has become a key technique in reinforcement learning (RL). In this paper, we propose a general framework that unifies model-based and model-free RL, and an Admissible Bellman Characterization (ABC) class that subsumes nearly all Markov Decision Process (MDP) models in the literature for tractable RL. We propose a novel estimation function with decomposable structural properties for optimization-based exploration and the functional eluder dimension as a complexity measure of the ABC class. Under our framework, a new sample-efficient algorithm, namely OPtimization-based ExploRation with Approximation (OPERA), is proposed, achieving regret bounds that match or improve over the best-known results for a variety of MDP models. In particular, for MDPs with low Witness rank, under a slightly…

Taxonomy

Topics: Reinforcement Learning in Robotics · Mental Health Research Topics

Methods: Approximate Bayesian Computation