Beyond Pixels: Learning Invariant Rewards for Real-World Robotics From a Few Demonstrations
Tengye Xu, Yangting Sun, Ziju Shen, Guanqi Chen, Zhen Fu, Chen Yizhou, Hua Chen, Jia Pan

TL;DR
This paper introduces a method for learning invariant, symbolic reward functions from minimal demonstrations, enabling better generalization and zero-shot transfer in real-world robotic manipulation tasks.
Contribution
It proposes a novel framework that shifts from visual feature fitting to discovering behavioral invariants, improving reward generalization across diverse task variants.
Findings
Achieves stronger process alignment and policy ranking on Meta-World and Franka tasks.
Demonstrates zero-shot generalization to position, viewpoint, and object variations.
Accelerates downstream policy learning with learned invariant rewards.
Abstract
Designing reward functions that generalize beyond controlled laboratory settings remains a fundamental challenge in reinforcement learning for robotics. In open-world manipulation problems, a single task can appear in numerous variants through different object instances, positions, and camera viewpoints. Recent vision-based reward models tend to memorize specific pixel distributions and fail to generalize beyond their training conditions. To address this, we propose a framework that learns invariant symbolic reward functions from as few as five demonstrations. The insight is to shift from visual feature-fitting to the discovery of behavioral invariants: task-level properties that remain constant across diverse visual instantiations. The framework has two coupled components: a structural reward formulation that encodes task-level strategies and physical constraints while preserving…
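To make the idea of a behavioral invariant concrete, here is a minimal Python sketch. It is not the authors' framework, whose details are not given in the excerpt above. It assumes a hypothetical staged reward, `symbolic_reward`, built only from relative quantities (gripper-to-object and object-to-goal distances). The function names, the `grasp_radius` threshold, and the two-stage structure are all illustrative assumptions. The point is only that a reward expressed over such relations does not change when the whole scene is translated, whereas a reward fitted to raw pixels generally would.

```python
import math


def symbolic_reward(gripper, obj, goal, grasp_radius=0.05):
    """Hypothetical staged reward that uses relative distances only.

    All arguments are 3D positions given as (x, y, z) tuples.
    `grasp_radius` is an assumed threshold, not a value from the paper.
    """
    reach = math.dist(gripper, obj)   # gripper-to-object distance
    place = math.dist(obj, goal)      # object-to-goal distance
    if reach >= grasp_radius:
        # Stage 1: the gripper has not reached the object yet.
        return -reach
    # Stage 2: the object is within reach; reward progress toward the goal.
    return 1.0 - place


def shift(point, offset):
    """Translate a point by a fixed offset (simulates moving the whole scene)."""
    return tuple(p + o for p, o in zip(point, offset))


# Original scene and a translated copy of the same scene.
gripper, obj, goal = (0.1, 0.2, 0.3), (0.4, 0.2, 0.3), (0.4, 0.6, 0.3)
offset = (1.0, -0.5, 0.25)
r_original = symbolic_reward(gripper, obj, goal)
r_shifted = symbolic_reward(
    shift(gripper, offset), shift(obj, offset), shift(goal, offset)
)
```

Up to floating-point rounding, `r_original` and `r_shifted` agree, because only relative positions enter the reward. Invariance to viewpoint or object instance would need further assumptions about how these symbolic quantities are extracted from observations, which this sketch does not address.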
