Representation Alignment from Human Feedback for Cross-Embodiment Reward Learning from Mixed-Quality Demonstrations
Connor Mattson, Anurag Aribandi, Daniel S. Brown

TL;DR
This paper explores how to learn transferable reward functions from mixed-quality video demonstrations across different embodiments, emphasizing the role of human feedback in aligning representations for effective cross-embodiment reinforcement learning.
Contribution
The paper introduces techniques that use human feedback to improve representation alignment and reward transfer in cross-embodiment inverse reinforcement learning from mixed-quality demonstrations.
Findings
Prior methods struggle to learn generalizable reward representations from mixed-quality data.
Human feedback improves reward representation alignment.
Different techniques lead to qualitatively different reward behaviors.
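To make "learning a reward from human feedback" concrete, here is a minimal sketch. It is not the paper's method: this listing does not say which form of feedback or which model the authors use. The sketch assumes that feedback arrives as pairwise preferences over trajectories, that each frame has already been mapped to a fixed embedding by some shared encoder, and that the reward is linear in that embedding, trained with a Bradley-Terry loss. All function names are hypothetical.

```python
import numpy as np

# Assumption: each trajectory is an array of shape (T, dim) holding per-frame
# embeddings from some embodiment-agnostic encoder (not specified in the source).

def trajectory_return(w, traj):
    """Sum of linear rewards r(s) = w . phi(s) over a trajectory of embeddings."""
    return float(np.sum(traj @ w))

def preference_loss_and_grad(w, traj_a, traj_b, label):
    """Bradley-Terry loss for one pair; label is 1.0 if traj_b is preferred, else 0.0."""
    # For a linear reward, the return difference depends only on summed features.
    diff = traj_b.sum(axis=0) - traj_a.sum(axis=0)
    logit = diff @ w
    p_b = 1.0 / (1.0 + np.exp(-logit))  # modeled probability that traj_b is preferred
    eps = 1e-12  # guards against log(0)
    loss = -(label * np.log(p_b + eps) + (1.0 - label) * np.log(1.0 - p_b + eps))
    grad = (p_b - label) * diff  # gradient of the cross-entropy with respect to w
    return loss, grad

def fit_reward(pairs, dim, lr=0.1, steps=200):
    """Full-batch gradient descent on the mean preference loss over all pairs."""
    w = np.zeros(dim)
    for _ in range(steps):
        grad = np.zeros(dim)
        for traj_a, traj_b, label in pairs:
            _, g = preference_loss_and_grad(w, traj_a, traj_b, label)
            grad += g
        w -= lr * grad / len(pairs)
    return w
```

Because the learned reward is a function of embeddings alone, it can be evaluated on any embodiment whose frames pass through the same encoder. That is the sense in which representation alignment matters for transfer in this sketch.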
Abstract
We study the problem of cross-embodiment inverse reinforcement learning, where we wish to learn a reward function from video demonstrations in one or more embodiments and then transfer the learned reward to a different embodiment (e.g., different action space, dynamics, size, shape, etc.). Learning reward functions that transfer across embodiments is important in settings such as teaching a robot a policy via human video demonstrations or teaching a robot to imitate a policy from another robot with a different embodiment. However, prior work has only focused on cases where near-optimal demonstrations are available, which is often difficult to ensure. By contrast, we study the setting of cross-embodiment reward learning from mixed-quality demonstrations. We demonstrate that prior work struggles to learn generalizable reward representations when learning from mixed-quality data. We then…
Taxonomy
Topics: Anomaly Detection Techniques and Applications · Reinforcement Learning in Robotics · Human Pose and Action Recognition
