Interaction-Grounded Learning
Tengyang Xie, John Langford, Paul Mineiro, Ida Momennejad

TL;DR
This paper introduces Interaction-Grounded Learning (IGL), a setting in which an agent receives no explicit reward and must instead infer a latent reward signal from multidimensional feedback in order to learn an effective interaction policy. The authors provide theoretical guarantees and an empirical evaluation.
Contribution
It proposes a learning framework in which the agent recovers a latent reward from the feedback vector instead of receiving an explicit reward, motivated by interaction scenarios such as prosthetic control.
Findings
Learners can successfully discover latent rewards under certain assumptions.
The approach comes with theoretical guarantees.
Empirical results demonstrate effective policy grounding in practice.
Abstract
Consider a prosthetic arm, learning to adapt to its user's control signals. We propose Interaction-Grounded Learning for this novel setting, in which a learner's goal is to interact with the environment with no grounding or explicit reward to optimize its policies. Such a problem evades common RL solutions which require an explicit reward. The learning agent observes a multidimensional context vector, takes an action, and then observes a multidimensional feedback vector. This multidimensional feedback vector has no explicit reward information. In order to succeed, the algorithm must learn how to evaluate the feedback vector to discover a latent reward signal, with which it can ground its policies without supervision. We show that in an Interaction-Grounded Learning setting, with certain natural assumptions, a learner can discover the latent reward and ground its policy for successful…
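The interaction protocol described in the abstract (observe a context, take an action, observe a feedback vector that carries no explicit reward) can be illustrated with a minimal sketch. Everything below is a hypothetical toy setup and not the authors' method: the class `ToyIGLEnvironment`, the rule that defines the hidden reward, the feedback distribution, and the uniform-random policy are all assumptions made for illustration. The sketch only shows what data the learner gets to see; it does not implement any reward-decoding algorithm.

```python
import random


class ToyIGLEnvironment:
    """Hypothetical toy environment: the reward stays hidden from the learner."""

    def __init__(self, n_actions=3, context_dim=4, feedback_dim=5, seed=0):
        assert context_dim >= n_actions
        self.n_actions = n_actions
        self.context_dim = context_dim
        self.feedback_dim = feedback_dim
        self.rng = random.Random(seed)
        self._context = None

    def observe_context(self):
        # Multidimensional context vector shown to the learner.
        self._context = [self.rng.gauss(0.0, 1.0) for _ in range(self.context_dim)]
        return list(self._context)

    def _latent_reward(self, action):
        # Toy rule (assumption): the "right" action is the index of the
        # largest of the first n_actions context coordinates.
        best = max(range(self.n_actions), key=lambda i: self._context[i])
        return 1 if action == best else 0

    def step(self, action):
        # The latent reward shapes the feedback but is never returned.
        reward = self._latent_reward(action)
        mean = 1.0 if reward else -1.0
        return [mean + self.rng.gauss(0.0, 0.5) for _ in range(self.feedback_dim)]


def collect_interactions(env, n_rounds, seed=1):
    """Run a uniform-random policy and log (context, action, feedback) triples."""
    policy_rng = random.Random(seed)
    log = []
    for _ in range(n_rounds):
        context = env.observe_context()
        action = policy_rng.randrange(env.n_actions)
        feedback = env.step(action)
        log.append((context, action, feedback))  # no reward is ever logged
    return log


env = ToyIGLEnvironment()
log = collect_interactions(env, n_rounds=100)
```

Each logged triple contains a context, an action, and a feedback vector, and nothing else. A learner in this setting would have to work out from such triples alone which feedback vectors indicate success.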
Taxonomy
Topics: Intelligent Tutoring Systems and Adaptive Learning · Machine Learning and Algorithms · Speech and dialogue systems
