Navigating Noisy Feedback: Enhancing Reinforcement Learning with Error-Prone Language Models

Muhan Lin; Shuyang Shi; Yue Guo; Behdad Chalaki; Vaishnav Tadiparthi; Ehsan Moradi Pari; Simon Stepputtis; Joseph Campbell; Katia Sycara

arXiv:2410.17389·cs.AI·October 24, 2024

TL;DR

This paper studies reinforcement learning from large language model feedback, which is prone to errors and hallucinations, and proposes a simple method that improves learning efficiency and policy quality despite that noise.

Contribution

The paper introduces a potential-based reward shaping method that uses language model feedback, remains robust to feedback errors, and improves RL convergence and performance.
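As a rough illustration of what a potential-based shaping function looks like in general, the sketch below adds the standard shaping term gamma * phi(s') - phi(s) to an environment reward. It is a minimal sketch under stated assumptions, not the authors' implementation: the names `EXAMPLE_SCORES`, `potential`, and `shaped_reward` are hypothetical, the per-state scores are made-up stand-ins for values that would be derived from language model feedback, and treating the terminal potential as zero is a common convention assumed here rather than something taken from the paper.

```python
from typing import Callable, Dict

# Hypothetical per-state scores standing in for values derived from LLM feedback.
# These numbers are illustrative only and do not come from the paper.
EXAMPLE_SCORES: Dict[str, float] = {"start": 0.1, "near_goal": 0.8, "goal": 1.0}


def potential(state: str) -> float:
    """Return the potential of a state; unknown states default to 0.0."""
    return EXAMPLE_SCORES.get(state, 0.0)


def shaped_reward(
    env_reward: float,
    state: str,
    next_state: str,
    phi: Callable[[str], float] = potential,
    gamma: float = 0.99,
    done: bool = False,
) -> float:
    """Add the potential-based shaping term gamma * phi(s') - phi(s).

    Assumption: the potential of a terminal state is treated as zero.
    """
    next_phi = 0.0 if done else phi(next_state)
    return env_reward + gamma * next_phi - phi(state)


if __name__ == "__main__":
    # Moving from a low-potential state to a higher-potential one
    # yields a positive shaping bonus even when the environment reward is zero.
    bonus = shaped_reward(0.0, "start", "near_goal", gamma=0.9)
    print(f"shaped reward: {bonus:.2f}")
```

Because the shaping term is a difference of potentials, its discounted sum along a trajectory telescopes and depends only on the start and end states, which is why shaping of this form is generally understood to leave the optimal policy unchanged.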

Findings

01. Empirically improves convergence speed and policy returns.

02. Effective even with significant ranking errors.

03. Eliminates the need for complex reward post-processing.

Abstract

The correct specification of reward models is a well-known challenge in reinforcement learning. Hand-crafted reward functions often lead to inefficient or suboptimal policies and may not be aligned with user values. Reinforcement learning from human feedback is a successful technique that can mitigate such issues; however, the collection of human feedback can be laborious. Recent works have solicited feedback from pre-trained large language models rather than humans to reduce or eliminate human effort; however, these approaches yield poor performance in the presence of hallucination and other errors. This paper studies the advantages and limitations of reinforcement learning from large language model feedback and proposes a simple yet effective method for soliciting and applying feedback as a potential-based shaping function. We theoretically show that inconsistent rankings, which…

Code & Models

Repositories

sy-shi/RLAIF_ScoreDiff (PyTorch, official)

Videos

Navigating Noisy Feedback: Enhancing Reinforcement Learning with Error-Prone Language Models

Taxonomy

Topics: Reinforcement Learning in Robotics

Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings