Evaluating Feature Dependent Noise in Preference-based Reinforcement Learning
Yuxuan Li, Harshith Reddy Kethireddy, Srijita Das

TL;DR
This paper investigates feature-dependent noise in preference-based reinforcement learning, revealing that certain noise types significantly impair performance and that simple methods can outperform specialized noise-robust algorithms.
Contribution
The paper formalizes feature-dependent noise, introduces several variants of it, and evaluates their impact on PbRL, highlighting the limitations of existing noise-robust methods and the relevance of language model noise.
Findings
Feature-dependent noise can significantly reduce PbRL performance.
Simple PbRL methods can outperform noise-robust algorithms under certain noise conditions.
Language model noise shares characteristics with feature-dependent noise, indicating realistic human-like noise patterns.
Abstract
Learning from Preferences in Reinforcement Learning (PbRL) has gained attention recently, as it is a natural fit for complicated tasks where the reward function is not easily available. However, preferences often come with uncertainty and noise if they are not from perfect teachers. Much prior work has aimed to detect noise, but it considers only a limited set of noise types, most of them uniformly distributed with no connection to observations. In this work, we formalize the notion of targeted feature-dependent noise and propose several variants: trajectory feature noise, trajectory similarity noise, margin-dependent noise, and Language Model noise. We evaluate feature-dependent noise, where noise is correlated with certain features, in complex continuous control tasks from DMControl and Meta-world. Our experiments show that in some feature-dependent noise settings, the state-of-the-art…
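To make the contrast between uniform and feature-dependent noise concrete, the sketch below compares a teacher that flips preference labels with a fixed probability against one whose flip probability depends on the return margin between the two segments. This is an illustration only, not the paper's formulation: the exponential decay, the function names, and the parameters `max_flip_prob` and `scale` are all assumptions introduced here.

```python
import math
import random


def true_preference(return_a, return_b):
    """Label from a perfect teacher: 1 if segment A has the higher return, else 0."""
    return 1 if return_a > return_b else 0


def uniform_noise_label(return_a, return_b, flip_prob, rng):
    """Uniform noise: flip the true label with a fixed probability,
    regardless of what the two segments look like."""
    label = true_preference(return_a, return_b)
    return 1 - label if rng.random() < flip_prob else label


def margin_flip_prob(return_a, return_b, max_flip_prob=0.5, scale=1.0):
    """Hypothetical margin-dependent flip probability (an assumption, not the
    paper's definition): close returns are flipped often, distant ones rarely."""
    margin = abs(return_a - return_b)
    return max_flip_prob * math.exp(-margin / scale)


def margin_noise_label(return_a, return_b, rng, max_flip_prob=0.5, scale=1.0):
    """Feature-dependent noise: the flip probability is a function of a
    feature of the pair, here the return margin."""
    label = true_preference(return_a, return_b)
    p = margin_flip_prob(return_a, return_b, max_flip_prob, scale)
    return 1 - label if rng.random() < p else label


if __name__ == "__main__":
    rng = random.Random(0)
    pairs = [(1.0, 1.1), (1.0, 5.0)]  # a hard pair and an easy pair
    for a, b in pairs:
        flips_uniform = sum(
            uniform_noise_label(a, b, 0.2, rng) != true_preference(a, b)
            for _ in range(10_000)
        )
        flips_margin = sum(
            margin_noise_label(a, b, rng) != true_preference(a, b)
            for _ in range(10_000)
        )
        print(f"pair=({a}, {b}) uniform flips={flips_uniform} margin flips={flips_margin}")
```

Under uniform noise the error rate is the same for both pairs, whereas under the margin-dependent variant errors concentrate on the pair with nearly equal returns. Noise of this kind is correlated with the observations themselves, which is the property the abstract contrasts with uniformly distributed noise.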
Taxonomy
Topics: Reinforcement Learning in Robotics · Machine Learning and Data Classification · Explainable Artificial Intelligence (XAI)
