Learning to Assist Humans without Inferring Rewards
Vivek Myers, Evan Ellis, Sergey Levine, Benjamin Eysenbach, Anca Dragan

TL;DR
This paper introduces a scalable method for assistive agents to enhance human influence without inferring explicit rewards, using contrastive successor representations to improve assistance in high-dimensional environments.
Contribution
It presents a novel contrastive successor representation approach that estimates empowerment, enabling scalable assistance without reward inference in complex settings.
Findings
Outperforms prior methods on synthetic benchmarks
Successfully scales to the Overcooked cooperative game
Provides theoretical connections between information theory and reinforcement learning
Abstract
Assistive agents should make humans' lives easier. Classically, such assistance is studied through the lens of inverse reinforcement learning, where an assistive agent (e.g., a chatbot, a robot) infers a human's intention and then selects actions to help the human reach that goal. This approach requires inferring intentions, which can be difficult in high-dimensional settings. We build upon prior work that studies assistance through the lens of empowerment: an assistive agent aims to maximize the influence of the human's actions such that they exert a greater control over the environmental outcomes and can solve tasks in fewer steps. We lift the major limitation of prior work in this area--scalability to high-dimensional settings--with contrastive successor representations. We formally prove that these representations estimate a similar notion of empowerment to that studied by prior…
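To make the notion of "influence of the human's actions" concrete, here is a toy sketch. It is not the paper's method, which estimates empowerment with contrastive successor representations. It assumes a one-step, discrete setting and measures influence as the mutual information between the human's action and the resulting state under a fixed uniform action prior. The function name `mutual_information` and the two example transition tables are hypothetical, chosen only for illustration.

```python
import math


def mutual_information(p_next_given_action, p_action):
    """Return I(A; S') in bits for a discrete channel p(s' | a) and prior p(a).

    p_next_given_action[a][s] is the probability of next state s after action a.
    This is a toy, tabular stand-in for empowerment, not the paper's estimator.
    """
    n_states = len(p_next_given_action[0])
    # Marginal over next states: p(s') = sum_a p(a) * p(s' | a)
    p_next = [
        sum(p_a * row[s] for p_a, row in zip(p_action, p_next_given_action))
        for s in range(n_states)
    ]
    mi = 0.0
    for p_a, row in zip(p_action, p_next_given_action):
        for s, p_s_given_a in enumerate(row):
            # Zero-probability terms contribute nothing to the sum.
            if p_a > 0 and p_s_given_a > 0:
                mi += p_a * p_s_given_a * math.log2(p_s_given_a / p_next[s])
    return mi


# Assumed uniform prior over two human actions.
uniform = [0.5, 0.5]

# Hypothetical "blocked" situation: both actions lead to the same outcome,
# so the human's choice has no effect on what happens next.
blocked = [[1.0, 0.0], [1.0, 0.0]]

# Hypothetical "assisted" situation: each action leads to a distinct outcome,
# so the human's choice fully determines what happens next.
assisted = [[0.0, 1.0], [1.0, 0.0]]

mi_blocked = mutual_information(blocked, uniform)
mi_assisted = mutual_information(assisted, uniform)

print(mi_blocked)   # 0.0
print(mi_assisted)  # 1.0
```

In this toy picture, an assistant that moves the world from the "blocked" situation to the "assisted" one raises the human's influence from 0 bits to 1 bit without ever needing to know which outcome the human wants.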
Taxonomy
Topics: Problem and Project Based Learning
