ReWiND: Language-Guided Rewards Teach Robot Policies without New Demonstrations

Jiahui Zhang; Yusen Luo; Abrar Anwar; Sumedh Anand Sontakke; Joseph J Lim; Jesse Thomason; Erdem Biyik; Jesse Zhang

arXiv:2505.10911 · cs.RO · September 23, 2025

Open Access

TL;DR

ReWiND is a framework that lets robots learn new manipulation tasks from language instructions alone. A reward function and a policy are first learned from a small demonstration dataset; the policy is then fine-tuned on unseen task variations using the learned rewards, with no new demonstrations required.

Contribution

ReWiND introduces a language-conditioned reward learning approach that generalizes to unseen tasks and enables sample-efficient policy adaptation without new demonstrations.

Findings

01. Outperforms baselines by up to 2.4x in reward generalization

02. Achieves 2x faster adaptation in simulation

03. Improves real-world policy performance by 5x

Abstract

We introduce ReWiND, a framework for learning robot manipulation tasks solely from language instructions without per-task demonstrations. Standard reinforcement learning (RL) and imitation learning methods require expert supervision through human-designed reward functions or demonstrations for every new task. In contrast, ReWiND starts from a small demonstration dataset to learn: (1) a data-efficient, language-conditioned reward function that labels the dataset with rewards, and (2) a language-conditioned policy pre-trained with offline RL using these rewards. Given an unseen task variation, ReWiND fine-tunes the pre-trained policy using the learned reward function, requiring minimal online interaction. We show that ReWiND's reward model generalizes effectively to unseen tasks, outperforming baselines by up to 2.4x in reward generalization and policy alignment metrics. Finally, we…
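The first step described in the abstract, labeling a demonstration dataset with a language-conditioned reward function, can be illustrated with a minimal sketch. Everything named here (`Transition`, `toy_reward_fn`, `label_dataset`) is hypothetical and not taken from the paper; in particular, the toy reward simply reads a progress value out of the observation, whereas the actual reward model is learned. Offline RL pre-training and online fine-tuning are omitted.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence

# Placeholder for an observation; a real system would use image frames.
Frame = Sequence[float]


@dataclass
class Transition:
    obs: Frame
    action: int
    instruction: str
    reward: float = 0.0  # unlabeled until a reward function is applied


# A reward function sees the frames so far plus the instruction text.
RewardFn = Callable[[List[Frame], str], float]


def toy_reward_fn(frames: List[Frame], instruction: str) -> float:
    """Hypothetical stand-in for a learned, language-conditioned reward model.

    Assumption for illustration only: the first feature of the latest frame
    encodes task progress, which is clamped to [0, 1]. The instruction is
    ignored here; a learned model would condition on it.
    """
    return max(0.0, min(1.0, float(frames[-1][0])))


def label_dataset(trajectory: List[Transition], reward_fn: RewardFn) -> List[Transition]:
    """Return a copy of the trajectory with a reward attached to every step."""
    frames: List[Frame] = []
    labeled: List[Transition] = []
    for step in trajectory:
        frames.append(step.obs)
        reward = reward_fn(frames, step.instruction)
        labeled.append(Transition(step.obs, step.action, step.instruction, reward))
    return labeled


demo = [
    Transition([0.0], 0, "open the drawer"),
    Transition([0.5], 1, "open the drawer"),
    Transition([1.5], 0, "open the drawer"),
]
labeled_demo = label_dataset(demo, toy_reward_fn)
print([t.reward for t in labeled_demo])  # [0.0, 0.5, 1.0]
```

In this sketch the labeled transitions would feed an offline RL learner, and the same reward function would score rollouts on an unseen instruction during fine-tuning.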

Taxonomy

Topics: Reinforcement Learning in Robotics · Robot Manipulation and Learning · Multimodal Machine Learning Applications