Reward Design with Language Models
Minae Kwon, Sang Michael Xie, Kalesha Bullard, Dorsa Sadigh

TL;DR
This paper proposes a novel method for reward design in reinforcement learning by using large language models as proxy reward functions, enabling natural language-based reward specification and training of aligned agents.
Contribution
It introduces a framework that leverages LLMs for reward design, simplifying the process with natural language prompts and demonstrating effectiveness across multiple strategic tasks.
Findings
RL agents trained with LLM-based rewards align well with user objectives
The approach outperforms supervised reward learning in various tasks
Natural language prompts effectively guide agent behavior
Abstract
Reward design in reinforcement learning (RL) is challenging since specifying human notions of desired behavior may be difficult via reward functions or require many expert demonstrations. Can we instead cheaply design rewards using a natural language interface? This paper explores how to simplify reward design by prompting a large language model (LLM) such as GPT-3 as a proxy reward function, where the user provides a textual prompt containing a few examples (few-shot) or a description (zero-shot) of the desired behavior. Our approach leverages this proxy reward function in an RL framework. Specifically, users specify a prompt once at the beginning of training. During training, the LLM evaluates an RL agent's behavior against the desired behavior described by the prompt and outputs a corresponding reward signal. The RL agent then uses this reward to update its behavior. We evaluate…
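The loop described in the abstract (a prompt written once, the LLM scoring the agent's behavior, and the agent updating on that score) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the Yes/No question format, the binary 0/1 reward, the `stub_llm` stand-in for a real model call, and the toy round-robin value learner are all hypothetical choices made to keep the example self-contained.

```python
from typing import Callable, Dict, List


def build_query(user_prompt: str, episode_text: str) -> str:
    """Join the user's one-time prompt with a textual description of an episode.

    The question wording is an assumption made for this sketch.
    """
    return (
        f"{user_prompt}\n\n"
        f"Episode: {episode_text}\n"
        "Does this behavior match the desired behavior? Answer Yes or No."
    )


def proxy_reward(llm: Callable[[str], str], user_prompt: str, episode_text: str) -> float:
    """Ask the LLM to judge the episode and map its answer to a reward.

    Assumption: a "Yes" answer becomes 1.0 and anything else becomes 0.0.
    """
    answer = llm(build_query(user_prompt, episode_text))
    return 1.0 if answer.strip().lower().startswith("yes") else 0.0


def stub_llm(query: str) -> str:
    """Hypothetical stand-in for a real LLM call.

    It answers Yes only when the episode line mentions an even split.
    """
    episode_lines = [line for line in query.splitlines() if line.startswith("Episode:")]
    return "Yes" if "even split" in episode_lines[-1] else "No"


def train(
    llm: Callable[[str], str],
    user_prompt: str,
    actions: List[str],
    steps: int = 20,
    lr: float = 0.5,
) -> Dict[str, float]:
    """Toy value learner that uses the LLM's judgment as its only reward signal."""
    values = {action: 0.0 for action in actions}
    for t in range(steps):
        # Round-robin action choice keeps the sketch deterministic.
        action = actions[t % len(actions)]
        reward = proxy_reward(llm, user_prompt, f"The agent chose to {action}.")
        # Move the value estimate toward the observed reward.
        values[action] += lr * (reward - values[action])
    return values


if __name__ == "__main__":
    prompt = "The agent should behave fairly toward the other player."
    learned = train(stub_llm, prompt, ["propose an even split", "keep everything"])
    print(max(learned, key=learned.get))
```

In a real setup, `stub_llm` would be replaced by a call to an actual language model and the toy learner by a standard RL algorithm; the point of the sketch is only that the prompt is fixed up front and the reward comes from the model's judgment of each episode.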
Taxonomy
Topics: Reinforcement Learning in Robotics · Topic Modeling
Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Softmax · Attention Dropout · Adam · Cosine Annealing · Linear Warmup With Cosine Annealing · Layer Normalization
