Video2Reward: Generating Reward Function from Videos for Legged Robot Behavior Learning

Runhao Zeng; Dingjie Zhou; Qiwei Liang; Junlin Liu; Hui Li; Changxin Huang; Jianqiang Li; Xiping Hu; Fuchun Sun

arXiv:2412.05515·cs.RO·July 1, 2025

TL;DR

This paper introduces a novel video2reward approach that generates reward functions directly from videos of behaviors, enabling more controllable and efficient learning of diverse legged robot motions, outperforming existing LLM-based methods.

Contribution

The paper presents a new method that converts videos into reward functions for robot learning, incorporating an iterative refinement scheme for improved accuracy and behavior diversity.

Findings

01. Outperforms state-of-the-art LLM-based reward methods by over 37.6% in human normalized score.

02. Enables rapid learning of diverse behaviors like walking and running.

03. Demonstrates effectiveness on both bipedal and quadrupedal robot tasks.

Abstract

Learning behavior in legged robots presents a significant challenge due to their inherent instability and complex constraints. Recent research has proposed using a large language model (LLM) to generate reward functions for reinforcement learning, removing the need for experts to design rewards manually. However, this approach relies on textual descriptions to define learning objectives, and it fails to achieve controllable and precise behavior learning with clear directionality. In this paper, we introduce a new video2reward method, which directly generates reward functions from videos depicting the behaviors to be mimicked and learned. Specifically, we first process videos containing the target behaviors, converting the motion information of individuals in the videos into keypoint trajectories represented as coordinates through a video2text transforming module. These…
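
As a rough illustration of the video-to-text step described in the abstract, the sketch below serializes keypoint trajectories into plain text and wraps them in a prompt that asks an LLM for a reward function. The function names (`trajectories_to_text`, `build_reward_prompt`), the keypoint names, the 2D coordinate layout, and the prompt wording are all assumptions made for illustration; none of them are taken from the paper or its repository, and the keypoint extraction and the LLM call are deliberately left out.

```python
from typing import Dict, List, Tuple

Point = Tuple[float, float]  # assumed layout: (x, y) coordinates of one keypoint in one frame


def trajectories_to_text(trajectories: Dict[str, List[Point]], precision: int = 2) -> str:
    """Serialize per-keypoint trajectories into one text line per frame."""
    if not trajectories:
        return ""
    lengths = {len(points) for points in trajectories.values()}
    if len(lengths) != 1:
        raise ValueError("all keypoints must cover the same number of frames")
    num_frames = lengths.pop()

    lines = []
    for frame in range(num_frames):
        # Sort keypoint names so the text is stable across runs.
        parts = [
            f"{name}=({points[frame][0]:.{precision}f}, {points[frame][1]:.{precision}f})"
            for name, points in sorted(trajectories.items())
        ]
        lines.append(f"frame {frame}: " + "; ".join(parts))
    return "\n".join(lines)


def build_reward_prompt(trajectory_text: str, robot_description: str) -> str:
    """Assemble a hypothetical prompt; the wording is illustrative only."""
    return (
        f"Robot: {robot_description}\n"
        "Keypoint trajectories extracted from a reference video:\n"
        f"{trajectory_text}\n"
        "Write a reward function that encourages the robot to reproduce this motion."
    )


# Two frames of a made-up trajectory with two hypothetical keypoints.
demo_trajectories = {
    "left_ankle": [(0.1, 0.0), (0.2, 0.05)],
    "hip": [(0.0, 0.5), (0.1, 0.5)],
}
demo_text = trajectories_to_text(demo_trajectories)
demo_prompt = build_reward_prompt(demo_text, "quadruped")
```

Here `demo_prompt` is an ordinary string that could be sent to any LLM client. Keeping the coordinates as rounded text keeps the prompt short, at the cost of discarding sub-centimeter detail that a real pipeline may or may not need.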

Code & Models

Repositories

Alvin-Zeng/Video2Reward
PyTorch · Official
