Guiding Pretraining in Reinforcement Learning with Large Language Models

Yuqing Du; Olivia Watkins; Zihan Wang; Cédric Colas; Trevor Darrell; Pieter Abbeel; Abhishek Gupta; Jacob Andreas

arXiv:2302.06692 · cs.LG · September 18, 2023 · 39 cites

TL;DR

This paper introduces ELLM, a method that uses large language models to guide exploration in reinforcement learning: the agent is rewarded for achieving goals suggested by a language model prompted with a description of the agent's current state, which improves coverage of meaningful behaviors.

Contribution

The paper presents a novel approach that uses large-scale language model pretraining to shape exploration in reinforcement learning without human intervention.

Findings

01  ELLM improves coverage of common-sense behaviors during pretraining.

02  ELLM-trained agents match or outperform baseline methods on downstream tasks.

03  The approach effectively guides agents toward human-meaningful behaviors.

Abstract

Reinforcement learning algorithms typically struggle in the absence of a dense, well-shaped reward function. Intrinsically motivated exploration methods address this limitation by rewarding agents for visiting novel states or transitions, but these methods offer limited benefits in large environments where most discovered novelty is irrelevant for downstream tasks. We describe a method that uses background knowledge from text corpora to shape exploration. This method, called ELLM (Exploring with LLMs), rewards an agent for achieving goals suggested by a language model prompted with a description of the agent's current state. By leveraging large-scale language model pretraining, ELLM guides agents toward human-meaningful and plausibly useful behaviors without requiring a human in the loop. We evaluate ELLM in the Crafter game environment and the Housekeep robotic simulator, showing that…
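The reward idea in the abstract (reward the agent when what it did matches a goal the language model suggested for its current state) can be illustrated with a small sketch. Everything here is an assumption made for illustration and is not taken from the paper or the yuqingd/ellm repository: the function names, the canned goal suggester standing in for a real language model call, the bag-of-words similarity standing in for a learned sentence encoder, and the 0.5 threshold.

```python
import math
from collections import Counter
from typing import List


def bag_of_words(text: str) -> Counter:
    """Toy text embedding: lowercase word counts (stand-in for a sentence encoder)."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def goal_reward(goals: List[str], achieved: str, threshold: float = 0.5) -> float:
    """Best similarity between the achieved behavior and any suggested goal.

    Similarities below `threshold` (a hypothetical value) yield zero reward.
    """
    best = max(
        (cosine(bag_of_words(g), bag_of_words(achieved)) for g in goals),
        default=0.0,
    )
    return best if best >= threshold else 0.0


def stub_goal_suggester(state_description: str) -> List[str]:
    """Hypothetical stand-in for a language model; returns canned goals."""
    if "tree" in state_description:
        return ["collect wood", "place table"]
    return ["explore the area"]


# Illustrative state description and behaviors (made up for this sketch).
goals = stub_goal_suggester("You see a tree and grass. Your inventory is empty.")
print(goal_reward(goals, "collect wood"))      # matches a suggested goal
print(goal_reward(goals, "attack the grass"))  # matches no suggested goal
```

In this sketch the first call earns a reward close to 1 because the behavior matches a suggested goal word for word, while the second earns nothing because it shares no words with any suggestion. A real system would replace the stub with an actual language model query and the word-count vectors with a semantic text encoder.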

Code & Models

Repositories

yuqingd/ellm
PyTorch · Official

Videos

Guiding Pretraining in Reinforcement Learning with Large Language Models · SlidesLive

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques