Mapping out the Space of Human Feedback for Reinforcement Learning: A Conceptual Framework
Yannick Metz, David Lindner, Raphaël Baur, Mennatallah El-Assady

TL;DR
This paper develops a comprehensive framework for understanding and categorizing human feedback in reinforcement learning, aiming to improve system design and identify research gaps.
Contribution
It introduces a taxonomy of human feedback types based on nine dimensions and identifies seven quality metrics affecting learning effectiveness.
Findings
Proposes a unified taxonomy for human feedback in RL.
Identifies key quality metrics influencing feedback effectiveness.
Highlights gaps and future directions in human-in-the-loop RL research.
Abstract
Reinforcement Learning from Human Feedback (RLHF) has become a powerful tool to fine-tune or train agentic machine learning models. Similar to how humans interact in social contexts, we can use many types of feedback to communicate our preferences, intentions, and knowledge to an RL agent. However, applications of human feedback in RL are often limited in scope and disregard human factors. In this work, we bridge the gap between machine learning and human-computer interaction efforts by developing a shared understanding of human feedback in interactive learning scenarios. We first introduce a taxonomy of feedback types for reward-based learning from human feedback based on nine key dimensions. Our taxonomy allows for unifying human-centered, interface-centered, and model-centered aspects. In addition, we identify seven quality metrics of human feedback influencing both the human ability…
Taxonomy
Topics: Complex Systems and Decision Making
