Boosting Robustness in Preference-Based Reinforcement Learning with Dynamic Sparsity

Calarina Muslimani; Bram Grooten; Deepak Ranganatha Sastry Mamillapalli; Mykola Pechenizkiy; Decebal Constantin Mocanu; Matthew E. Taylor

arXiv:2406.06495·cs.LG·July 8, 2025

TL;DR

This paper introduces R2N, a novel preference-based reinforcement learning algorithm that employs dynamic sparse training to focus on task-relevant features, improving robustness and performance in diverse environments.

Contribution

R2N is the first PbRL algorithm to use dynamic sparse training for focusing on task-relevant features, enhancing robustness and adaptability.

Findings

01 R2N outperforms existing sparse training and PbRL algorithms in simulated robotic tasks.

02 R2N adapts the sparse connectivity of its neural networks to focus on task-relevant features.

03 In experiments with a simulated teacher, R2N achieves significant performance improvements.
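To make the idea of adapting network connectivity concrete, here is a minimal sketch of one generic prune-and-grow step, the kind of topology update commonly used in dynamic sparse training. It is not R2N's actual update rule, which this summary does not specify. The function name `prune_and_grow`, the `drop_fraction` parameter, magnitude-based pruning, and uniform random regrowth are all assumptions made for illustration.

```python
import numpy as np


def prune_and_grow(weights, mask, drop_fraction, rng):
    """One illustrative topology update for a sparse layer (assumed scheme).

    Drops the smallest-magnitude active weights, then activates the same
    number of connections chosen at random among previously inactive ones.
    """
    mask = mask.astype(bool)
    flat_mask = mask.ravel().copy()
    # Inactive weights are forced to zero so the mask is the source of truth.
    flat_weights = (weights * mask).ravel().copy()

    active = np.flatnonzero(mask)
    inactive = np.flatnonzero(~mask)
    n_drop = int(drop_fraction * active.size)
    if n_drop == 0:
        return weights * mask, mask

    # Prune: remove the active connections with the smallest magnitude.
    order = np.argsort(np.abs(flat_weights[active]))
    dropped = active[order[:n_drop]]
    flat_mask[dropped] = False
    flat_weights[dropped] = 0.0

    # Grow: activate random connections that were inactive before this step.
    n_grow = min(n_drop, inactive.size)
    grown = rng.choice(inactive, size=n_grow, replace=False)
    flat_mask[grown] = True
    flat_weights[grown] = 0.0  # new connections start at zero (an assumption)

    return flat_weights.reshape(weights.shape), flat_mask.reshape(mask.shape)
```

Applied periodically during training, a step like this keeps the number of active connections constant while letting the layer move capacity toward inputs that turn out to matter.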

Abstract

To integrate into human-centered environments, autonomous agents must learn from and adapt to humans in their native settings. Preference-based reinforcement learning (PbRL) can enable this by learning reward functions from human preferences. However, humans live in a world full of diverse information, most of which is irrelevant to completing any particular task. It then becomes essential that agents learn to focus on the subset of task-relevant state features. To that end, this work proposes R2N (Robust-to-Noise), the first PbRL algorithm that leverages principles of dynamic sparse training to learn robust reward models that can focus on task-relevant features. In experiments with a simulated teacher, we demonstrate that R2N can adapt the sparse connectivity of its neural networks to focus on task-relevant features, enabling R2N to significantly outperform several sparse training and…
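As a rough illustration of how a reward function can be learned from preferences, the sketch below computes a cross-entropy loss under the Bradley-Terry model, a common choice in PbRL. The abstract does not state which loss R2N uses, so treat this as an assumption. The function name `preference_loss` and the array layout are hypothetical.

```python
import numpy as np


def preference_loss(r_hat_a, r_hat_b, prefers_a):
    """Bradley-Terry cross-entropy over pairs of trajectory segments.

    r_hat_a, r_hat_b: arrays of shape (batch, T) holding the reward model's
        per-step predictions for the two segments in each pair.
    prefers_a: array of shape (batch,), 1.0 if the teacher prefers segment a,
        0.0 if the teacher prefers segment b.
    """
    # Predicted return of each segment is the sum of its per-step rewards.
    logits = r_hat_a.sum(axis=1) - r_hat_b.sum(axis=1)
    # Numerically stable log-sigmoid: log(sigmoid(x)) = -logaddexp(0, -x).
    log_p_a = -np.logaddexp(0.0, -logits)
    log_p_b = -np.logaddexp(0.0, logits)
    return -np.mean(prefers_a * log_p_a + (1.0 - prefers_a) * log_p_b)
```

Minimizing this quantity pushes the reward model to assign a higher summed reward to whichever segment the teacher preferred.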

Taxonomy

Topics: Multi-Criteria Decision Making

Methods: Focus