Unified Algorithms for RL with Decision-Estimation Coefficients: PAC, Reward-Free, Preference-Based Learning, and Beyond

Fan Chen; Song Mei; Yu Bai

arXiv:2209.11745 · cs.LG · December 24, 2024 · 1 citation

Open Access

TL;DR

This paper introduces a unified algorithmic framework based on the Decision-Estimation Coefficient for efficiently addressing various reinforcement learning goals, including exploration, model estimation, and preference learning.

Contribution

It develops a generalized DEC framework that unifies multiple RL learning goals and provides a basis for new sample-efficient algorithms and lower bounds.

Findings

01. Unified framework covers multiple RL goals

02. New sample-efficient results for diverse learning tasks

03. Re-analysis of existing algorithms with DEC bounds

Abstract

Modern Reinforcement Learning (RL) is more than just learning the optimal policy; alternative learning goals such as exploring the environment, estimating the underlying model, and learning from preference feedback are all of practical importance. While provably sample-efficient algorithms for each specific goal have been proposed, these algorithms often depend strongly on the particular learning goal and thus have correspondingly different structures. It is a pressing open question whether these learning goals can instead be tackled by a single unified algorithm. We make progress on this question by developing a unified algorithmic framework for a large class of learning goals, building on the Decision-Estimation Coefficient (DEC) framework. Our framework handles many learning goals such as no-regret RL, PAC RL, reward-free learning, model estimation, and preference-based learning, all…

Taxonomy

Topics: Advanced Causal Inference Techniques · Supply Chain and Inventory Management · Reinforcement Learning in Robotics