CANDERE-COACH: Reinforcement Learning from Noisy Feedback

Yuxuan Li; Srijita Das; Matthew E. Taylor

arXiv:2409.15521 · cs.LG · September 25, 2024

TL;DR

CANDERE-COACH introduces a reinforcement learning method that effectively learns from noisy binary feedback, filtering out errors to improve learning performance even when feedback accuracy drops to 60%.

Contribution

The paper presents a novel noise-filtering mechanism for RL from noisy binary feedback, enabling learning with up to 40% incorrect teacher signals.

Findings

01. Effective learning with 40% noisy feedback

02. Outperforms baseline methods in three domains

03. Demonstrates robustness to teacher feedback errors
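
To make the "60% accuracy" and "40% noisy feedback" figures concrete, the following minimal sketch simulates a teacher whose binary evaluative signal is flipped with a fixed probability. It is only an illustration of the noise setting described above, not the paper's noise-filtering mechanism, which this page does not detail. The function names `noisy_binary_feedback` and `empirical_accuracy`, the +1/-1 encoding, the four-action setup, and the symmetric flip model are all assumptions made for the example.

```python
import random


def noisy_binary_feedback(action, optimal_action, accuracy=0.6, rng=random):
    """Return +1/-1 teacher feedback that is correct with probability `accuracy`.

    Hypothetical noise model: the true label is flipped with
    probability (1 - accuracy), independently at every step.
    """
    correct = 1 if action == optimal_action else -1
    # Keep the true label with probability `accuracy`, otherwise flip it.
    return correct if rng.random() < accuracy else -correct


def empirical_accuracy(n=10_000, accuracy=0.6, seed=0):
    """Estimate how often the simulated teacher agrees with the true label."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n):
        action = rng.randrange(4)   # assumed toy setting with 4 discrete actions
        optimal = rng.randrange(4)
        truth = 1 if action == optimal else -1
        hits += noisy_binary_feedback(action, optimal, accuracy, rng) == truth
    return hits / n


if __name__ == "__main__":
    # With accuracy=0.6, roughly 40% of the signals are wrong.
    print(f"empirical teacher accuracy: {empirical_accuracy():.3f}")
```

Under this assumed model, a learner that trusts every signal is trained on labels that are wrong about four times in ten, which is the regime the findings above refer to.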

Abstract

Reinforcement learning (RL) has recently been applied to many challenging tasks. However, to perform well, it requires access to a good reward function, which is often sparse or manually engineered with scope for error. Introducing human prior knowledge, through approaches such as imitation learning, learning from preferences, and inverse reinforcement learning, is often seen as a possible solution to this problem. Learning from feedback is another framework that enables an RL agent to learn from binary evaluative signals describing the teacher's (positive or negative) evaluation of the agent's action. However, these methods often assume that evaluative teacher feedback is perfect, which is a restrictive assumption. In practice, such feedback can be noisy due to limited teacher expertise or other exacerbating factors like cognitive load, availability,…

Taxonomy

Topics: Neural Networks and Applications · Reinforcement Learning in Robotics