Words as Beacons: Guiding RL Agents with High-Level Language Prompts

Unai Ruiz-Gonzalez; Alain Andres; Pedro G. Bascoy; Javier Del Ser

arXiv:2410.08632 · cs.AI · October 14, 2024

Open Access

TL;DR

This paper introduces a novel RL framework where Large Language Models act as teachers to guide agents with subgoals, significantly improving exploration and learning speed in sparse reward environments.

Contribution

It proposes a teacher-student RL framework using LLMs for subgoal generation, enabling more efficient exploration without ongoing LLM intervention during deployment.

Findings

01. Accelerates learning by up to 200 times compared to baselines.

02. Uses LLMs to generate subgoals based on environment descriptions.

03. Effective in complex, procedurally generated environments.

Abstract

Sparse reward environments in reinforcement learning (RL) pose significant challenges for exploration, often leading to inefficient or incomplete learning processes. To tackle this issue, this work proposes a teacher-student RL framework that leverages Large Language Models (LLMs) as "teachers" to guide the agent's learning process by decomposing complex tasks into subgoals. Due to their inherent capability to understand RL environments based on a textual description of structure and purpose, LLMs can provide subgoals to accomplish the task defined for the environment, much as a human would. To this end, three types of subgoals are proposed: positional targets relative to the agent, object representations, and language-based instructions generated directly by the LLM. More importantly, we show that it is possible to query the LLM only during the training phase,…
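The three subgoal types named in the abstract can be illustrated with a small sketch. Everything here is an assumption made for illustration and not the authors' implementation: the `Subgoal` class, the `stub_teacher` function (a fixed plan standing in for an LLM query), the grid coordinates, the object vocabulary, and the `bonus` value are all hypothetical. The `training` flag only mimics the idea that the teacher is consulted during training; how the paper handles deployment is not visible in the truncated abstract.

```python
from dataclasses import dataclass
from typing import List, Tuple

Pos = Tuple[int, int]


@dataclass(frozen=True)
class Subgoal:
    """One teacher-proposed subgoal (hypothetical structure)."""
    target: Pos        # absolute grid cell the agent should reach
    object_name: str   # object associated with the subgoal
    instruction: str   # free-form language instruction


def stub_teacher(description: str) -> List[Subgoal]:
    """Stand-in for an LLM query: ignores the description and returns a fixed plan."""
    return [
        Subgoal((1, 3), "key", "pick up the key"),
        Subgoal((4, 3), "door", "open the door"),
        Subgoal((6, 6), "goal", "go to the goal"),
    ]


def relative_position(agent: Pos, sg: Subgoal) -> Pos:
    """Subgoal type 1: positional target relative to the agent."""
    return (sg.target[0] - agent[0], sg.target[1] - agent[1])


def object_representation(sg: Subgoal, vocabulary: List[str]) -> List[int]:
    """Subgoal type 2: one-hot encoding of the target object (assumed encoding)."""
    return [1 if name == sg.object_name else 0 for name in vocabulary]


def language_representation(sg: Subgoal) -> str:
    """Subgoal type 3: the instruction text itself."""
    return sg.instruction


def shaped_reward(env_reward: float, agent: Pos, subgoals: List[Subgoal],
                  next_index: int, bonus: float = 0.1,
                  training: bool = True) -> Tuple[float, int]:
    """Add a bonus when the next pending subgoal is reached (training only).

    Returns the possibly shaped reward and the index of the next pending subgoal.
    """
    if training and next_index < len(subgoals) and agent == subgoals[next_index].target:
        return env_reward + bonus, next_index + 1
    return env_reward, next_index


plan = stub_teacher("A room with a key, a locked door, and a goal square.")
first = plan[0]
rel = relative_position((0, 0), first)
one_hot = object_representation(first, ["key", "door", "goal"])
text = language_representation(first)
train_step = shaped_reward(0.0, (1, 3), plan, 0)
deploy_step = shaped_reward(0.0, (1, 3), plan, 0, training=False)
```

In this sketch the same plan is exposed to the agent in three interchangeable forms, and the sparse environment reward is supplemented only while `training` is true, so a policy evaluated with `training=False` sees the unmodified reward.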

Taxonomy

Topics: Multi-Agent Systems and Negotiation · Natural Language Processing Techniques · Topic Modeling