The Effect of Modeling Human Rationality Level on Learning Rewards from Multiple Feedback Types
Gaurav R. Ghosal, Matthew Zurek, Daniel S. Brown, Anca D. Dragan

TL;DR
This paper shows that grounding the human rationality coefficient in real data for each feedback type, rather than assuming a default value, significantly improves reward learning in human-in-the-loop systems.
Contribution
The paper introduces a data-driven approach for setting the human rationality level separately for each feedback type, which improves reward inference accuracy over a single fixed, assumed value.
Findings
Fitting rationality coefficients to human data enhances reward learning.
Overestimating human rationality can harm reward inference accuracy.
Comparisons can be more informative than demonstrations when humans act suboptimally.
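To make the second finding concrete, here is a minimal sketch of how an overestimated rationality coefficient can turn a single human slip into a confident wrong inference. It assumes a Boltzmann (softmax) choice model, a uniform prior, and two hypothetical reward hypotheses; the function name `posterior_after_choice` and all numbers are illustrative and are not taken from the paper.

```python
import math

def posterior_after_choice(reward_hypotheses, chosen, assumed_beta):
    """Posterior over reward hypotheses after one observed choice (uniform prior).

    reward_hypotheses: one reward vector per hypothesis, giving the reward of
    every option under that hypothesis (hypothetical toy values).
    assumed_beta: the rationality coefficient the learner assumes.
    """
    likelihoods = []
    for rewards in reward_hypotheses:
        # Softmax likelihood of the observed choice; subtract the max for stability.
        m = max(assumed_beta * r for r in rewards)
        weights = [math.exp(assumed_beta * r - m) for r in rewards]
        likelihoods.append(weights[chosen] / sum(weights))
    total = sum(likelihoods)
    return [lik / total for lik in likelihoods]

# Hypothesis 0 (treated as the true reward here) prefers option 0;
# hypothesis 1 prefers option 1.
hypotheses = [[1.0, 0.0], [0.0, 1.0]]

# Suppose the human slips and picks option 1.
moderate = posterior_after_choice(hypotheses, chosen=1, assumed_beta=1.0)
overconfident = posterior_after_choice(hypotheses, chosen=1, assumed_beta=10.0)

print("posterior on true hypothesis, beta=1: ", moderate[0])
print("posterior on true hypothesis, beta=10:", overconfident[0])
```

With the moderate coefficient the learner keeps substantial probability on the true hypothesis after the mistake, while the high coefficient treats the mistake as near-conclusive evidence for the wrong one.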
Abstract
When inferring reward functions from human behavior (be it demonstrations, comparisons, physical corrections, or e-stops), it has proven useful to model the human as making noisy-rational choices, with a "rationality coefficient" capturing how much noise or entropy we expect to see in the human behavior. Prior work typically sets the rationality level to a constant value, regardless of the type, or quality, of human feedback. However, in many settings, giving one type of feedback (e.g. a demonstration) may be much more difficult than a different type of feedback (e.g. answering a comparison query). Thus, we expect to see more or less noise depending on the type of human feedback. In this work, we advocate that grounding the rationality coefficient in real data for each feedback type, rather than assuming a default value, has a significant positive effect on reward learning. We test this…
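The noisy-rational model described in the abstract is commonly formalized as a Boltzmann (softmax) choice rule, in which the probability of a choice is proportional to exp(beta * reward) and beta is the rationality coefficient. The sketch below assumes that form and shows one plausible way to ground beta in data: a grid-search maximum-likelihood fit on calibration choices whose rewards are known. The helper names, the calibration data, and the candidate grid are hypothetical, and this is not presented as the authors' exact procedure.

```python
import math

def boltzmann_probs(rewards, beta):
    """Choice probabilities proportional to exp(beta * reward)."""
    m = max(beta * r for r in rewards)  # subtract the max for numerical stability
    weights = [math.exp(beta * r - m) for r in rewards]
    total = sum(weights)
    return [w / total for w in weights]

def log_likelihood(beta, observations):
    """Log-likelihood of observed choices under a given beta.

    observations: list of (rewards, chosen_index) pairs with known rewards.
    """
    return sum(math.log(boltzmann_probs(rewards, beta)[chosen])
               for rewards, chosen in observations)

def fit_beta(observations, candidates):
    """Return the candidate beta with the highest log-likelihood."""
    return max(candidates, key=lambda b: log_likelihood(b, observations))

# Hypothetical calibration data for one feedback type: the better option
# (reward 1.0 vs 0.0) was chosen in 8 of 10 comparison queries.
observations = [([1.0, 0.0], 0)] * 8 + [([1.0, 0.0], 1)] * 2
candidates = [0.1 * k for k in range(1, 51)]  # beta grid from 0.1 to 5.0

best_beta = fit_beta(observations, candidates)
print("fitted beta:", best_beta)
```

Running the same fit separately on calibration data for each feedback type (demonstrations, comparisons, and so on) would give a per-type coefficient, which is the kind of grounding the abstract argues for.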
Taxonomy
Topics: Neural and Behavioral Psychology Studies · Decision-Making and Behavioral Economics · Mental Health Research Topics
Methods: Test
