Learning Contextually-Adaptive Rewards via Calibrated Features
Alexandra Forsey-Smerek, Julie Shah, and Andreea Bobu

TL;DR
This paper introduces calibrated features, a modular approach to explicitly model context-dependent feature saliency in reward learning, significantly improving sample efficiency and personalization in human-in-the-loop settings.
Contribution
It proposes a novel method for explicitly learning context-dependent feature saliency separately from preferences, enhancing transferability and efficiency over existing implicit methods.
Findings
Requires 10x fewer preference queries than baselines
Achieves up to 15% better performance in low-data regimes
Enables personalized, adaptable reward learning in user studies
Abstract
A key challenge in reward learning from human input is that desired agent behavior often changes based on context. For example, a robot must adapt to avoid a stove once it becomes hot. We observe that while high-level preferences (e.g., prioritizing safety over efficiency) often remain constant, context alters the saliency (or importance) of reward features. For instance, stove heat changes the relevance of the robot's proximity, not the underlying preference for safety. Moreover, these contextual effects recur across tasks, motivating the need for transferable representations to encode them. Existing multi-task and meta-learning methods simultaneously learn representations and task preferences, at best implicitly capturing contextual effects and requiring substantial data to separate them from task-specific preferences. Instead, we propose …
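The separation the abstract describes, fixed preferences combined with context-dependent feature saliency, can be illustrated with a small sketch. This is not the authors' implementation: the function names, the sigmoid form of the saliency, the 60 °C midpoint, and the weight values are all hypothetical, chosen only to mirror the stove example.

```python
import math

# Hypothetical preference weights: fixed across contexts
# (safety is weighted more heavily than efficiency).
PREFERENCE_WEIGHTS = {"safety": -2.0, "efficiency": -1.0}


def stove_saliency(stove_temp_c: float) -> float:
    """Assumed saliency in (0, 1): near 0 for a cold stove, near 1 for a hot one."""
    return 1.0 / (1.0 + math.exp(-(stove_temp_c - 60.0) / 10.0))


def calibrated_proximity(distance_m: float, stove_temp_c: float) -> float:
    """Base proximity feature scaled by its context-dependent saliency."""
    proximity = math.exp(-distance_m)  # 1.0 at the stove, decays with distance
    return stove_saliency(stove_temp_c) * proximity


def reward(distance_m: float, path_length_m: float, stove_temp_c: float) -> float:
    """Linear reward over one calibrated feature and one plain feature."""
    return (
        PREFERENCE_WEIGHTS["safety"] * calibrated_proximity(distance_m, stove_temp_c)
        + PREFERENCE_WEIGHTS["efficiency"] * path_length_m
    )


# Same state and same preference weights; only the context (stove temperature) differs.
reward_cold = reward(distance_m=0.2, path_length_m=1.0, stove_temp_c=20.0)
reward_hot = reward(distance_m=0.2, path_length_m=1.0, stove_temp_c=200.0)
```

In this sketch the weights never change. Passing near the stove is penalized more when it is hot only because the saliency of the proximity feature rises with temperature.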
