GIFT: Generalizing Intent for Flexible Test-Time Rewards
Fin Amin, Nathaniel Dennler, Andreea Bobu

TL;DR
GIFT introduces a framework that leverages language models to infer human intent from demonstrations, enabling reward functions to generalize to new environments and objects without retraining.
Contribution
GIFT is the first method to ground reward generalization in human intent using language models, improving robustness across distribution shifts in robotic tasks.
Findings
Outperforms visual and semantic similarity baselines in simulated tasks
Achieves reliable transfer to real-world robotic manipulation
Demonstrates robustness across over 50 unseen objects
Abstract
Robots learn reward functions from user demonstrations, but these rewards often fail to generalize to new environments. This failure occurs because learned rewards latch onto spurious correlations in training data rather than the underlying human intent that demonstrations represent. Existing methods leverage visual or semantic similarity to improve robustness, yet these surface-level cues often diverge from what humans actually care about. We present Generalizing Intent for Flexible Test-Time Rewards (GIFT), a framework that grounds reward generalization in human intent rather than surface cues. GIFT leverages language models to infer high-level intent from user demonstrations by contrasting preferred with non-preferred behaviors. At deployment, GIFT maps novel test states to behaviorally equivalent training states via intent-conditioned similarity, enabling learned rewards to…
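The deployment step described in the abstract (mapping a novel test state to a behaviorally equivalent training state, then reusing that state's learned reward) can be illustrated with a toy sketch. Everything below is an assumption made for illustration and is not the authors' implementation: the attribute dictionaries, the object names, the reward values, and the function names are hypothetical, and a hand-written attribute match stands in for the language-model intent inference that GIFT uses.

```python
from typing import Dict, List

Obj = Dict[str, str]  # hypothetical object description: attribute name -> value


def intent_conditioned_similarity(a: Obj, b: Obj, intent_attrs: List[str]) -> float:
    """Fraction of intent-relevant attributes on which two objects agree.

    Stand-in for a language-model judgment; only attributes named by the
    (assumed) inferred intent are compared, everything else is ignored.
    """
    if not intent_attrs:
        return 0.0
    matches = sum(
        1 for k in intent_attrs if a.get(k) is not None and a.get(k) == b.get(k)
    )
    return matches / len(intent_attrs)


def map_to_training_state(
    test_obj: Obj, train_objs: Dict[str, Obj], intent_attrs: List[str]
) -> str:
    """Return the name of the training object most similar under the intent."""
    return max(
        train_objs,
        key=lambda name: intent_conditioned_similarity(
            test_obj, train_objs[name], intent_attrs
        ),
    )


def transferred_reward(
    test_obj: Obj,
    train_objs: Dict[str, Obj],
    train_rewards: Dict[str, float],
    intent_attrs: List[str],
) -> float:
    """Reuse the learned reward of the mapped training object, without retraining."""
    return train_rewards[map_to_training_state(test_obj, train_objs, intent_attrs)]


# Hypothetical training objects and learned rewards for "carry near the table edge".
TRAIN_OBJS = {
    "glass_cup": {"material": "glass", "fragile": "yes", "color": "clear"},
    "steel_pan": {"material": "steel", "fragile": "no", "color": "gray"},
}
TRAIN_REWARDS = {"glass_cup": -1.0, "steel_pan": 0.5}

# Unseen test object: shares a surface cue (color) with the pan,
# but shares the intent-relevant property (fragility) with the cup.
CERAMIC_VASE = {"material": "ceramic", "fragile": "yes", "color": "gray"}

if __name__ == "__main__":
    print(map_to_training_state(CERAMIC_VASE, TRAIN_OBJS, ["fragile"]))
    print(map_to_training_state(CERAMIC_VASE, TRAIN_OBJS, ["color"]))
```

In this toy setup, conditioning on the intent-relevant attribute maps the vase to the cup, whereas matching on a surface cue such as color maps it to the pan. That contrast is the kind of divergence between surface similarity and intent that the abstract describes.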
Taxonomy
Topics: Reinforcement Learning in Robotics · Robot Manipulation and Learning · Social Robot Interaction and HRI
