Reward Design with Language Models

Minae Kwon; Sang Michael Xie; Kalesha Bullard; Dorsa Sadigh

arXiv:2303.00001 · cs.LG · March 2, 2023 · 21 cites

TL;DR

This paper proposes a novel method for reward design in reinforcement learning by using large language models as proxy reward functions, enabling natural language-based reward specification and training of aligned agents.

Contribution

It introduces a framework that leverages LLMs for reward design, simplifying the process with natural language prompts and demonstrating effectiveness across multiple strategic tasks.

Findings

1. RL agents trained with LLM-based rewards align well with user objectives
2. The approach outperforms supervised reward learning in various tasks
3. Natural language prompts effectively guide agent behavior

Abstract

Reward design in reinforcement learning (RL) is challenging since specifying human notions of desired behavior may be difficult via reward functions or require many expert demonstrations. Can we instead cheaply design rewards using a natural language interface? This paper explores how to simplify reward design by prompting a large language model (LLM) such as GPT-3 as a proxy reward function, where the user provides a textual prompt containing a few examples (few-shot) or a description (zero-shot) of the desired behavior. Our approach leverages this proxy reward function in an RL framework. Specifically, users specify a prompt once at the beginning of training. During training, the LLM evaluates an RL agent's behavior against the desired behavior described by the prompt and outputs a corresponding reward signal. The RL agent then uses this reward to update its behavior. We evaluate…
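The loop described in the abstract (specify a prompt once, then have the LLM score each episode against it) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the prompt layout, the yes/no question, and the mapping to a binary reward are all assumptions made for illustration, and `stub_llm` is a keyword-matching stand-in for a real model so the example runs offline.

```python
from typing import Callable, List

# Hypothetical type: any function mapping a prompt string to a completion string.
LLM = Callable[[str], str]


def build_prompt(description: str, examples: List[str], episode: str) -> str:
    """Combine the user's one-time specification with a textual episode summary."""
    parts = [description] + examples
    parts.append(f"Episode: {episode}")
    parts.append("Does this episode match the desired behavior? Answer yes or no.")
    return "\n".join(parts)


def parse_reward(completion: str) -> int:
    """Map the completion to a binary reward (an assumption of this sketch)."""
    return 1 if completion.strip().lower().startswith("yes") else 0


def proxy_reward(llm: LLM, description: str, examples: List[str], episode: str) -> int:
    """Query the LLM once per episode and return the parsed reward."""
    return parse_reward(llm(build_prompt(description, examples, episode)))


def stub_llm(prompt: str) -> str:
    """Stand-in for a real model: says yes if the episode line mentions a 50/50 split."""
    episode_line = [ln for ln in prompt.splitlines() if ln.startswith("Episode:")][-1]
    return "Yes" if "50/50" in episode_line else "No"


if __name__ == "__main__":
    description = "The agent should propose fair splits."  # zero-shot description
    for episode in ["agent proposed a 50/50 split", "agent proposed a 90/10 split"]:
        print(episode, "->", proxy_reward(stub_llm, description, [], episode))
```

In an actual training run, the returned value would be handed to the RL algorithm as the episode's reward, and `stub_llm` would be replaced by a call to a real language model. Passing a non-empty `examples` list corresponds to the few-shot setting; an empty list corresponds to the zero-shot setting.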

Code & Models

Repositories

minaek/reward_design_with_llms (PyTorch, official)

Videos

Reward Design with Language Models · SlidesLive

Taxonomy

Topics: Reinforcement Learning in Robotics · Topic Modeling

Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Softmax · Attention Dropout · Adam · Cosine Annealing · Linear Warmup With Cosine Annealing · Layer Normalization