Learning to Influence Human Behavior with Offline Reinforcement Learning
Joey Hong, Sergey Levine, Anca Dragan

TL;DR
This paper explores how offline reinforcement learning can be used to influence human behavior effectively, especially when humans are suboptimal, by learning from real human interaction data without online experimentation.
Contribution
It introduces a method for offline RL to learn influence strategies from human data, addressing suboptimality and strategy adaptation in human-AI interactions.
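As a rough illustration of what learning from logged interaction data without online experimentation can look like, here is a minimal sketch of tabular Q-learning run over a fixed dataset. It is not the authors' method. The state names (`"stuck"`, `"adapted"`), the two actions (0 = follow the human, 1 = nudge the human), the rewards, and the function names are all hypothetical, chosen only to show the agent preferring an action that shifts the human into a better strategy.

```python
from collections import defaultdict

def offline_q_learning(dataset, n_actions, gamma=0.9, lr=0.5, epochs=200):
    """Tabular Q-learning over a fixed log of transitions.

    Each transition is (state, action, reward, next_state, done).
    No new data is collected: the same log is replayed every epoch.
    """
    q = defaultdict(lambda: [0.0] * n_actions)
    for _ in range(epochs):
        for s, a, r, s2, done in dataset:
            # Bootstrapped target; terminal transitions use the reward alone.
            target = r if done else r + gamma * max(q[s2])
            q[s][a] += lr * (target - q[s][a])
    return q

def greedy_action(q, state):
    """Return the index of the highest-valued action in `state`."""
    values = q[state]
    return max(range(len(values)), key=values.__getitem__)

# Hypothetical log: the human's strategy is the state, the agent's move is the action.
dataset = [
    ("stuck", 0, 0.0, "stuck", False),      # following leaves the human stuck
    ("stuck", 1, 0.0, "adapted", False),    # nudging changes the human's strategy
    ("adapted", 0, 1.0, "adapted", True),   # following an adapted human succeeds
    ("adapted", 1, 0.0, "stuck", False),    # nudging an adapted human undoes it
]

q = offline_q_learning(dataset, n_actions=2)
print(greedy_action(q, "stuck"), greedy_action(q, "adapted"))
```

In this toy log every state-action pair appears at least once. With real human data many pairs are never observed, and plain Q-learning can overestimate their values, which is why practical offline RL methods typically add some form of conservatism or policy constraint.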
Findings
Offline RL can learn influence strategies from suboptimal human data.
Agents can steer humans towards better performance on new tasks.
Modeling human behavior enables influence on underlying human strategies.
Abstract
When interacting with people, AI agents do not just influence the state of the world -- they also influence the actions people take in response to the agent, and even their underlying intentions and strategies. Accounting for and leveraging this influence has mostly been studied in settings where it is sufficient to assume that human behavior is near-optimal: competitive games, or general-sum settings like autonomous driving alongside human drivers. Instead, we focus on influence in settings where there is a need to capture human suboptimality. For instance, imagine a collaborative task in which, due either to cognitive biases or lack of information, people do not perform very well -- how could an agent influence them towards more optimal behavior? Assuming near-optimal human behavior will not work here, and so the agent needs to learn from real human data. But experimenting online with…
Taxonomy
Topics: Reinforcement Learning in Robotics · Ethics and Social Impacts of AI · Explainable Artificial Intelligence (XAI)
