Advances in Preference-based Reinforcement Learning: A Review

Youssef Abdelkareem; Shady Shehata; Fakhri Karray

arXiv:2408.11943·cs.AI·August 23, 2024

TL;DR

This survey reviews recent advances in preference-based reinforcement learning, highlighting new approaches, theoretical guarantees, benchmarking efforts, and real-world applications, while discussing current limitations and future directions.

Contribution

It provides a unified framework for recent PbRL methods, summarizes theoretical and benchmarking progress, and discusses practical applications and future research challenges.

Findings

01. Improved scalability and efficiency in PbRL methods
02. Theoretical guarantees established for several approaches
03. PbRL successfully applied to complex real-world tasks

Abstract

Reinforcement Learning (RL) algorithms suffer from the dependency on accurately engineered reward functions to properly guide the learning agents to do the required tasks. Preference-based reinforcement learning (PbRL) addresses that by utilizing human preferences as feedback from the experts instead of numeric rewards. Due to its promising advantage over traditional RL, PbRL has gained more focus in recent years with many significant advances. In this survey, we present a unified PbRL framework to include the newly emerging approaches that improve the scalability and efficiency of PbRL. In addition, we give a detailed overview of the theoretical guarantees and benchmarking work done in the field, while presenting its recent applications in complex real-world tasks. Lastly, we go over the limitations of the current approaches and the proposed future research directions.

