Interactive Groupwise Comparison for Reinforcement Learning from Human Feedback

Jan Kompatscher; Danqing Shi; Giovanna Varni; Tino Weinkauf; Antti Oulasvirta

arXiv:2507.04340 · cs.LG · December 1, 2025

Open Access

TL;DR

This paper introduces an interactive visualization tool and active learning approach for reinforcement learning from human feedback, enabling more efficient exploration and comparison of behaviors to improve AI alignment.

Contribution

It presents a novel interactive visualization interface combined with active learning to enhance data collection in RLHF, improving policy quality and reward outcomes.

Findings

01 Increased final rewards by 69.34% in six robotics tasks

02 Lower error rates in behavior comparison

03 Better policies achieved through the proposed method

Abstract

Reinforcement learning from human feedback (RLHF) has emerged as a key enabling technology for aligning AI behaviour with human preferences. The traditional way to collect data in RLHF is via pairwise comparisons: human raters are asked to indicate which one of two samples they prefer. We present an interactive visualisation that better exploits the human visual ability to compare and explore whole groups of samples. The interface comprises two linked views: 1) an exploration view showing a contextual overview of all sampled behaviours organised in a hierarchical clustering structure; and 2) a comparison view displaying two selected groups of behaviours for user queries. Users can efficiently explore large sets of behaviours by iterating between these two views. Additionally, we devised an active learning approach suggesting groups for comparison. As shown by our evaluation in six…
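As a rough illustration of the groupwise idea described in the abstract, the sketch below clusters a handful of toy behaviour features hierarchically and then expands a single group-level preference into pairwise labels. The feature vectors, the average-linkage criterion, the helper names `agglomerate` and `expand_group_preference`, and the rule that preferring one group implies preferring each of its members over each member of the other group are all assumptions made for illustration; none of them is taken from the paper.

```python
from itertools import product
from math import dist


def agglomerate(points, n_groups):
    """Average-linkage agglomerative clustering (illustrative, O(n^3)).

    Returns a list of groups, each a list of indices into `points`.
    """
    clusters = [[i] for i in range(len(points))]

    def linkage(a, b):
        # Mean Euclidean distance between all cross-cluster pairs.
        total = sum(dist(points[i], points[j]) for i in a for j in b)
        return total / (len(a) * len(b))

    while len(clusters) > n_groups:
        # Find the two closest clusters and merge them.
        candidates = (
            (p, q)
            for p in range(len(clusters))
            for q in range(p + 1, len(clusters))
        )
        x, y = min(
            candidates,
            key=lambda pq: linkage(clusters[pq[0]], clusters[pq[1]]),
        )
        clusters[x] = clusters[x] + clusters[y]
        del clusters[y]
    return clusters


def expand_group_preference(preferred, other):
    """Assumed rule: every member of `preferred` beats every member of `other`."""
    return list(product(preferred, other))


# Hypothetical 2-D features, one per sampled behaviour.
features = [(0.0, 0.1), (0.1, 0.0), (0.05, 0.05), (1.0, 1.1), (1.1, 0.9)]
groups = agglomerate(features, n_groups=2)

# One group-level answer from a rater yields several pairwise labels.
pairs = expand_group_preference(groups[0], groups[1])
```

Under this assumed expansion rule, one comparison between a group of three behaviours and a group of two produces six pairwise labels, which hints at why comparing groups could be more label-efficient than comparing individual pairs.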

Taxonomy

Topics: Reinforcement Learning in Robotics · Evolutionary Algorithms and Applications