Self-Refined Large Language Model as Automated Reward Function Designer for Deep Reinforcement Learning in Robotics

Jiayang Song; Zhehua Zhou; Jiawei Liu; Chunrong Fang; Zhan Shu; Lei Ma

arXiv:2309.06687·cs.RO·October 3, 2023·5 cites

TL;DR

This paper introduces a novel LLM-based framework with self-refinement for automated reward function design in deep reinforcement learning, significantly reducing manual effort and improving performance in robotic control tasks.

Contribution

The work presents a self-refining LLM framework that automatically generates and iteratively improves reward functions for robotic reinforcement learning, yielding reward functions that match or outperform manually designed ones.

Findings

01 LLM-designed reward functions match or surpass manual ones.

02 The framework is effective across multiple robotic systems.

03 Automated reward design reduces manual effort in RL.

Abstract

Although Deep Reinforcement Learning (DRL) has achieved notable success in numerous robotic applications, designing a high-performing reward function remains a challenging task that often requires substantial manual input. Recently, Large Language Models (LLMs) have been extensively adopted to address tasks demanding in-depth common-sense knowledge, such as reasoning and planning. Recognizing that reward function design is also inherently linked to such knowledge, LLMs offer promising potential in this context. Motivated by this, we propose in this work a novel LLM framework with a self-refinement mechanism for automated reward function design. The framework commences with the LLM formulating an initial reward function based on natural language inputs. Then, the performance of the reward function is assessed, and the results are presented back to the LLM for guiding its…
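The propose, assess, and feed-back cycle described in the abstract can be sketched as a small loop. This is a minimal illustration only, not the authors' implementation: the function `self_refine_reward`, its parameters, the stopping rule (`target_score`, `max_rounds`), and the toy stand-ins for the LLM and the evaluator are all hypothetical. In a real setting, `propose` and `refine` would query an LLM and `evaluate` would train and test a DRL policy.

```python
from typing import Callable, List, Tuple


def self_refine_reward(
    propose: Callable[[str], str],        # hypothetical: LLM drafts a reward spec from text
    evaluate: Callable[[str], float],     # hypothetical: scores a reward spec (e.g. via training)
    refine: Callable[[str, float], str],  # hypothetical: LLM revises the spec given feedback
    task_description: str,
    target_score: float,                  # assumed stopping threshold
    max_rounds: int = 5,                  # assumed cap on refinement rounds
) -> Tuple[str, float, List[float]]:
    """Run a propose -> evaluate -> refine loop and return the last spec tried."""
    reward_spec = propose(task_description)
    score = evaluate(reward_spec)
    history = [score]
    for _ in range(max_rounds):
        if score >= target_score:
            break
        # Feed the assessment result back so the next draft can improve on it.
        reward_spec = refine(reward_spec, score)
        score = evaluate(reward_spec)
        history.append(score)
    return reward_spec, score, history


# Toy stand-ins so the sketch runs without an LLM or a simulator.
def toy_propose(task: str) -> str:
    # Start with a reward spec that has a single weight set to zero.
    return "w=0.0"


def toy_evaluate(spec: str) -> float:
    # Pretend the ideal weight is 1.0; the score drops with distance from it.
    weight = float(spec.split("=")[1])
    return 1.0 - abs(weight - 1.0)


def toy_refine(spec: str, score: float) -> str:
    # Nudge the weight upward by a fixed step, ignoring the score for simplicity.
    weight = float(spec.split("=")[1])
    return f"w={weight + 0.5}"


if __name__ == "__main__":
    spec, score, history = self_refine_reward(
        toy_propose, toy_evaluate, toy_refine,
        task_description="reach the target", target_score=0.9,
    )
    print(spec, score, history)
```

Passing the three steps as callables keeps the loop independent of any particular LLM API or simulator, which is a design choice of this sketch rather than something stated in the paper.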

Code & Models

Repositories

zhehuazhou/llm_reward_design (Official)

Taxonomy

Topics: Topic Modeling · Software Engineering Research · Reinforcement Learning in Robotics