Beyond Pixels: Learning Invariant Rewards for Real-World Robotics From a Few Demonstrations

Tengye Xu; Yangting Sun; Ziju Shen; Guanqi Chen; Zhen Fu; Chen yizhou; Hua Chen; Jia Pan

arXiv:2605.22123 · cs.RO · May 22, 2026

TL;DR

This paper introduces a method for learning invariant, symbolic reward functions from as few as five demonstrations, enabling better generalization and zero-shot transfer in real-world robotic manipulation tasks.

Contribution

The paper proposes a framework that shifts reward learning from visual feature fitting to discovering behavioral invariants (task-level properties that stay constant across different visual instantiations), improving reward generalization across diverse task variants.

Findings

01. Achieves stronger process alignment and policy ranking on Meta-World and Franka tasks.

02. Demonstrates zero-shot generalization to position, viewpoint, and object variations.

03. Accelerates downstream policy learning with learned invariant rewards.

Abstract

Designing reward functions that generalize beyond controlled laboratory settings remains a fundamental challenge in reinforcement learning for robotics. In open-world manipulation problems, a single task can appear in numerous variants through different object instances, positions, and camera viewpoints. Recent vision-based reward models tend to memorize specific pixel distributions and fail to generalize beyond their training conditions. To address this, we propose a framework that learns invariant symbolic reward functions from as few as five demonstrations. The insight is to shift from visual feature-fitting to the discovery of behavioral invariants: task-level properties that remain constant across diverse visual instantiations. The framework has two coupled components: a structural reward formulation that encodes task-level strategies and physical constraints while preserving…
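
To make the notion of a behavioral invariant concrete, the following is a minimal sketch and not the paper's actual reward formulation. It assumes a hypothetical pick-and-place scene described by three 3D points (`gripper`, `obj`, `goal`) and a hand-written reward, `relational_reward`, built only from relative distances. Because the reward never reads absolute coordinates, shifting the whole scene leaves it unchanged, which is the kind of position invariance the abstract describes. All names and numbers here are illustrative assumptions.

```python
import math


def relational_reward(gripper, obj, goal):
    """Hypothetical symbolic reward built only from relative quantities.

    The two terms encode a simple task strategy: move the gripper toward
    the object, and move the object toward the goal. This is an assumed
    illustration, not the reward learned in the paper.
    """
    reach_term = -math.dist(gripper, obj)  # closer gripper-object is better
    place_term = -math.dist(obj, goal)     # closer object-goal is better
    return reach_term + place_term


def translate(point, offset):
    """Shift a 3D point by a fixed offset."""
    return tuple(p + o for p, o in zip(point, offset))


# An assumed scene, and the same scene moved to a different workspace location.
scene = {
    "gripper": (0.0, 0.0, 0.5),
    "obj": (0.3, 0.1, 0.0),
    "goal": (0.6, 0.4, 0.0),
}
offset = (1.0, -2.0, 0.25)
shifted_scene = {name: translate(p, offset) for name, p in scene.items()}

r_original = relational_reward(**scene)
r_shifted = relational_reward(**shifted_scene)

# The reward depends only on relative geometry, so both values agree
# up to floating-point rounding.
print(math.isclose(r_original, r_shifted, rel_tol=1e-9))
```

A reward fitted to raw pixels or absolute coordinates would generally not pass this check, since the same behavior looks different after the shift.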
