Masked IRL: LLM-Guided Reward Disambiguation from Demonstrations and Language

Minyoung Hwang; Alexandra Forsey-Smerek; Nathaniel Dennler; Andreea Bobu

arXiv:2511.14565 · cs.RO · April 1, 2026

TL;DR

Masked IRL uses large language models to combine demonstrations with language instructions. The LLM identifies which aspects of the task are relevant and resolves ambiguous instructions, so the learned reward generalizes better and requires less data.

Contribution

The paper introduces Masked IRL, a novel framework that uses LLMs to infer relevance masks and clarify ambiguous instructions, enhancing reward learning from limited data.
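
The relevance-mask idea can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a linear reward over a hand-written state vector and a hard-coded binary mask standing in for the one an LLM would infer from the instruction. The feature names, weights, example instruction, and the helpers `apply_mask` and `masked_reward` are all hypothetical.

```python
from typing import Sequence


def apply_mask(state: Sequence[float], mask: Sequence[int], fill: float = 0.0) -> list:
    """Replace state dimensions marked irrelevant (mask == 0) with a fixed fill value."""
    if len(state) != len(mask):
        raise ValueError("state and mask must have the same length")
    return [s if m else fill for s, m in zip(state, mask)]


def linear_reward(weights: Sequence[float], state: Sequence[float]) -> float:
    """Toy reward: a weighted sum of state features (an assumption for illustration)."""
    return sum(w * s for w, s in zip(weights, state))


def masked_reward(weights: Sequence[float], state: Sequence[float], mask: Sequence[int]) -> float:
    """Evaluate the toy reward on the masked state only."""
    return linear_reward(weights, apply_mask(state, mask))


# Hypothetical state layout:
# [distance_to_table, distance_to_human, cup_tilt, distance_to_laptop]
weights = [0.5, 2.0, -1.0, 0.3]

# Hypothetical mask for an instruction such as "stay away from the human":
# only distance_to_human is treated as relevant.
mask = [0, 1, 0, 0]

state_a = [0.2, 0.8, 0.1, 0.5]
state_b = [0.9, 0.8, 0.7, 0.0]  # differs from state_a only in masked-out dimensions

r_a = masked_reward(weights, state_a, mask)
r_b = masked_reward(weights, state_b, mask)
print(r_a == r_b)  # the masked reward ignores the irrelevant dimensions
print(linear_reward(weights, state_a) == linear_reward(weights, state_b))
```

In this toy setting the masked reward gives both states the same value, while the unmasked reward is swayed by dimensions the instruction never mentioned. That is the kind of spurious dependence a relevance mask is meant to suppress.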

Findings

01. Outperforms prior language-conditioned IRL methods by up to 15%.

02. Uses up to 4.7 times less data for effective learning.

03. Improves sample efficiency, generalization, and robustness to ambiguous language.

Abstract

Robots can adapt to user preferences by learning reward functions from demonstrations, but with limited data, reward models often overfit to spurious correlations and fail to generalize. This happens because demonstrations show robots how to do a task but not what matters for that task, causing the model to focus on irrelevant state details. Natural language can more directly specify what the robot should focus on, and, in principle, disambiguate between many reward functions consistent with the demonstrations. However, existing language-conditioned reward learning methods typically treat instructions as simple conditioning signals, without fully exploiting their potential to resolve ambiguity. Moreover, real instructions are often ambiguous themselves, so naive conditioning is unreliable. Our key insight is that these two input types carry complementary information: demonstrations show…

Code & Models

Repositories

MIT-CLEAR-Lab/Masked-IRL (GitHub)
