RL-VLM-F: Reinforcement Learning from Vision Language Foundation Model Feedback

Yufei Wang; Zhanyi Sun; Jesse Zhang; Zhou Xian; Erdem Biyik; David Held; Zackory Erickson

arXiv:2402.03681·cs.RO·June 18, 2024·6 cites

Open Access · 1 Repo

TL;DR

RL-VLM-F introduces an automatic reward generation method for reinforcement learning that leverages vision-language foundation models to interpret task goals from text and visual observations, reducing human effort.

Contribution

It proposes learning reward functions from VLM preference labels over pairs of the agent's image observations, given only a text description of the task goal, which removes the need for manual reward engineering in RL tasks.

Findings

01 Effective reward functions learned across diverse domains

02 Outperforms prior methods in reward generation accuracy

03 Operates without human supervision

Abstract

Reward engineering has long been a challenge in Reinforcement Learning (RL) research, as it often requires extensive human effort and iterative trial-and-error to design effective reward functions. In this paper, we propose RL-VLM-F, a method that automatically generates reward functions for agents to learn new tasks, using only a text description of the task goal and the agent's visual observations, by leveraging feedback from vision language foundation models (VLMs). The key to our approach is to query these models to give preferences over pairs of the agent's image observations based on the text description of the task goal, and then learn a reward function from the preference labels, rather than directly prompting these models to output a raw reward score, which can be noisy and inconsistent. We demonstrate that RL-VLM-F successfully produces effective rewards and…
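The preference-based step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a Bradley-Terry preference model, a linear reward over small feature vectors in place of an image-based network, and a hypothetical `toy_labeler` that stands in for the VLM query. All function names and hyperparameters here are illustrative.

```python
import math
import random


def reward(w, x):
    """Linear reward r(x) = w . x (assumed stand-in for an image-based reward network)."""
    return sum(wi * xi for wi, xi in zip(w, x))


def preference_prob(w, x_a, x_b):
    """Bradley-Terry probability that observation b is preferred over observation a."""
    return 1.0 / (1.0 + math.exp(reward(w, x_a) - reward(w, x_b)))


def train_reward(pairs, labels, dim, lr=0.5, epochs=200):
    """Fit w by gradient descent on the cross-entropy between predicted
    preferences and labels (1 = b preferred, 0 = a preferred)."""
    w = [0.0] * dim
    for _ in range(epochs):
        grad = [0.0] * dim
        for (x_a, x_b), y in zip(pairs, labels):
            p = preference_prob(w, x_a, x_b)
            # Gradient of the cross-entropy loss: (p - y) * (x_b - x_a)
            for i in range(dim):
                grad[i] += (p - y) * (x_b[i] - x_a[i])
        w = [wi - lr * g / len(pairs) for wi, g in zip(w, grad)]
    return w


def toy_labeler(x_a, x_b):
    """Hypothetical stand-in for the VLM query: prefers the larger first feature."""
    return 1 if x_b[0] > x_a[0] else 0


# Toy "observations": 2-D feature vectors instead of images (assumption).
rng = random.Random(0)
obs = [[rng.uniform(-1, 1), rng.uniform(-1, 1)] for _ in range(40)]
pairs = [(obs[i], obs[i + 1]) for i in range(0, 40, 2)]
labels = [toy_labeler(a, b) for a, b in pairs]

w = train_reward(pairs, labels, dim=2)
```

In this sketch the learned reward only has to rank observations consistently with the preference labels, which is why pairwise comparisons can be easier to obtain reliably than absolute scores. An RL agent would then be trained against `reward(w, x)` in place of a hand-written reward.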

Code & Models

Repositories

yufeiwang63/rl-vlm-f
pytorch

Taxonomy

Topics: Robotics and Automated Systems · Multimodal Machine Learning Applications · Fuzzy Logic and Control Systems