Mapping out the Space of Human Feedback for Reinforcement Learning: A Conceptual Framework

Yannick Metz; David Lindner; Raphaël Baur; Mennatallah El-Assady

arXiv:2411.11761·cs.LG·February 21, 2025

TL;DR

This paper develops a comprehensive framework for understanding and categorizing human feedback in reinforcement learning, aiming to improve system design and identify research gaps.

Contribution

It introduces a taxonomy of human feedback types based on nine dimensions and identifies seven quality metrics affecting learning effectiveness.

Findings

01. Proposes a unified taxonomy for human feedback in RL.

02. Identifies key quality metrics influencing feedback effectiveness.

03. Highlights gaps and future directions in human-in-the-loop RL research.

Abstract

Reinforcement Learning from Human Feedback (RLHF) has become a powerful tool to fine-tune or train agentic machine learning models. Similar to how humans interact in social contexts, we can use many types of feedback to communicate our preferences, intentions, and knowledge to an RL agent. However, applications of human feedback in RL are often limited in scope and disregard human factors. In this work, we bridge the gap between machine learning and human-computer interaction efforts by developing a shared understanding of human feedback in interactive learning scenarios. We first introduce a taxonomy of feedback types for reward-based learning from human feedback based on nine key dimensions. Our taxonomy allows for unifying human-centered, interface-centered, and model-centered aspects. In addition, we identify seven quality metrics of human feedback influencing both the human ability…

Taxonomy

Topics: Complex Systems and Decision Making