STARLING: Self-supervised Training of Text-based Reinforcement Learning Agent with Large Language Models

Shreyas Basavatia; Keerthiram Murugesan; Shivam Ratnakar

arXiv:2406.05872·cs.LG·June 11, 2024

TL;DR

STARLING is a self-supervised training environment for text-based reinforcement learning agents. It uses a large language model (GPT-3) and an interactive fiction game engine to generate games automatically from a seed set of game ideas, with the aim of improving generalization and skill transfer to target environments.

Contribution

The paper presents a novel automated framework, STARLING, that generates diverse text-based games using GPT-3 to improve RL agent training and evaluation.

Findings

01. Current RL agents struggle to transfer skills to new situations.

02. STARLING enables scalable generation of diverse training environments.

03. Experimental results show potential for self-supervised RL in text-based games.

Abstract

Interactive fiction games have emerged as an important application to improve the generalization capabilities of language-based reinforcement learning (RL) agents. Existing environments for interactive fiction games are domain-specific or time-consuming to generate and do not train the RL agents to master a specific set of skills. In this work, we introduce an interactive environment for self-supervised RL, STARLING, for text-based games that bootstraps the text-based RL agents with automatically generated games (based on the seed set of game ideas) to boost the performance and generalization capabilities to reach a goal of the target environment. These games let the agent hone their skills on a predefined set of tasks. We create and test an environment with 100 games, generated using this automated framework that uses large language models (GPT-3) and an interactive fiction game engine…

Code & Models

Repositories

ibm/starling-agent (Official)

Videos

STARLING: Self-supervised Training of Text-based Reinforcement Learning Agent with Large Language Models

Taxonomy

Topics: Topic Modeling

Methods: Sparse Evolutionary Training