Learning Reward for Physical Skills using Large Language Model

Yuwei Zeng; Yiqing Xu

arXiv:2310.14092 · cs.RO · October 24, 2023 · 1 citation

Open Access

TL;DR

This paper introduces a method that uses large language models to generate and iteratively refine reward functions for physical skill learning, addressing the high dimensionality of state and action spaces and the cost of collecting expert demonstrations.

Contribution

The paper combines LLM-proposed reward features and parameterizations with environment feedback to create and tune reward functions for physical skills.

Findings

01. Effective reward functions generated for simulated physical tasks

02. Iterative self-alignment reduces ranking inconsistency

03. Method demonstrates improved learning support in simulations

Abstract

Learning reward functions for physical skills is challenging due to the vast spectrum of skills, the high dimensionality of the state and action spaces, and nuanced sensory feedback. The complexity of these tasks makes acquiring expert demonstration data both costly and time-consuming. Large Language Models (LLMs) contain valuable task-related knowledge that can aid in learning these reward functions. However, directly applying LLMs to propose reward functions has limitations, such as numerical instability and an inability to incorporate environment feedback. We aim to extract task knowledge from LLMs using environment feedback to create efficient reward functions for physical skills. Our approach consists of two components. We first use the LLM to propose features and a parameterization of the reward function. Next, we update the parameters of this proposed reward function…
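The two components described in the abstract can be illustrated with a small sketch. The abstract is truncated before it states how the parameters are updated, so the update rule below is an assumption rather than the authors' method: it swaps in a standard Bradley-Terry pairwise logistic objective and fits the weights of a linear reward so that its ordering of trajectories agrees with a given preference ranking. The feature names (`progress_to_goal`, `energy_used`), the trajectory values, the hard-coded ranking, and the helper names `ranking_inconsistency` and `fit_weights` are all hypothetical.

```python
import math

# Hypothetical features per trajectory: [progress_to_goal, energy_used].
# Trajectories are listed best to worst; the ranking is hard-coded here
# purely for illustration.
ranked = [
    [0.9, 0.2],
    [0.6, 0.1],
    [0.5, 0.6],
    [0.1, 0.3],
]

def reward(weights, features):
    """Linear reward: weighted sum of the proposed features."""
    return sum(w * f for w, f in zip(weights, features))

def ranking_inconsistency(weights, ranked):
    """Fraction of pairs that the reward orders differently from the ranking."""
    pairs = [(i, j) for i in range(len(ranked)) for j in range(i + 1, len(ranked))]
    wrong = sum(
        1 for i, j in pairs
        if reward(weights, ranked[i]) <= reward(weights, ranked[j])
    )
    return wrong / len(pairs)

def fit_weights(weights, ranked, lr=0.5, epochs=500):
    """Batch gradient ascent on a Bradley-Terry pairwise log-likelihood."""
    weights = list(weights)
    for _ in range(epochs):
        grad = [0.0] * len(weights)
        for i in range(len(ranked)):
            for j in range(i + 1, len(ranked)):
                # Feature difference between the preferred and the other trajectory.
                diff = [a - b for a, b in zip(ranked[i], ranked[j])]
                margin = reward(weights, diff)
                # Probability the current reward assigns to the preferred ordering.
                p = 1.0 / (1.0 + math.exp(-margin))
                for k, d in enumerate(diff):
                    grad[k] += (1.0 - p) * d
        weights = [w + lr * g for w, g in zip(weights, grad)]
    return weights

initial = [0.0, 1.0]  # deliberately poor starting guess (assumption)
learned = fit_weights(initial, ranked)
before = ranking_inconsistency(initial, ranked)
after = ranking_inconsistency(learned, ranked)
print(f"inconsistency before: {before:.2f}, after: {after:.2f}")
```

The linear form keeps the sketch short; any differentiable parameterization could be fitted the same way, and the pairwise inconsistency measure is only one plausible reading of the "ranking inconsistency" mentioned in the findings.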

Taxonomy

Topics: Topic Modeling