LaGR-SEQ: Language-Guided Reinforcement Learning with Sample-Efficient Querying

Thommen George Karimpanal; Laknath Buddhika Semage; Santu Rana; Hung Le; Truyen Tran; Sunil Gupta; Svetha Venkatesh

arXiv:2308.13542 · cs.AI · August 29, 2023

TL;DR

LaGR-SEQ leverages large language models to guide reinforcement learning efficiently by training a secondary agent to decide when to query the LLM, reducing costs and improving training efficiency in sequential decision tasks.

Contribution

This work introduces a novel framework combining language-guided RL with sample-efficient querying, optimizing LLM usage during RL training.

Findings

1. Enhanced RL training efficiency with fewer LLM queries
2. Effective secondary RL agent for query decision-making
3. Demonstrated improvements on sequential decision tasks

Abstract

Large language models (LLMs) have recently demonstrated their impressive ability to provide context-aware responses via text. This ability could potentially be used to predict plausible solutions in sequential decision making tasks pertaining to pattern completion. For example, by observing a partial stack of cubes, LLMs can predict the correct sequence in which the remaining cubes should be stacked by extrapolating the observed patterns (e.g., cube sizes, colors or other attributes) in the partial stack. In this work, we introduce LaGR (Language-Guided Reinforcement learning), which uses this predictive ability of LLMs to propose solutions to tasks that have been partially completed by a primary reinforcement learning (RL) agent, in order to subsequently guide the latter's training. However, as RL training is generally not sample-efficient, deploying this approach would inherently…
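The querying idea described above can be illustrated with a small sketch. It is not the authors' implementation: the abstract is truncated here, so the reward design, the tabular gate, and the `mock_llm_is_correct` stand-in (which assumes the LLM extrapolates correctly only once at least half of the cube stack is visible) are all hypothetical choices made for illustration. A secondary agent observes how many cubes the primary agent has placed and learns whether querying is worth a fixed cost.

```python
SKIP, QUERY = 0, 1


class QueryGate:
    """Hypothetical tabular gate: one value per (state, action) pair.

    QUERY values start optimistically high so every state is tried
    a few times before the gate can settle on skipping.
    """

    def __init__(self, n_states, lr=0.5, optimistic_init=1.0):
        self.q = [[0.0, optimistic_init] for _ in range(n_states)]
        self.lr = lr

    def act(self, state):
        skip_value, query_value = self.q[state]
        return QUERY if query_value > skip_value else SKIP

    def update(self, state, action, reward):
        # Move the stored value a fraction `lr` toward the observed reward.
        self.q[state][action] += self.lr * (reward - self.q[state][action])


def mock_llm_is_correct(cubes_placed, n_cubes):
    """Stand-in for a real LLM call (assumption, not from the paper)."""
    return 2 * cubes_placed >= n_cubes


def train_gate(n_cubes=6, episodes=60, query_cost=0.2):
    gate = QueryGate(n_states=n_cubes)
    queries = 0
    for episode in range(episodes):
        # State: number of cubes already stacked, cycled deterministically.
        state = episode % n_cubes
        action = gate.act(state)
        if action == QUERY:
            queries += 1
            solved = mock_llm_is_correct(state, n_cubes)
            # Assumed reward: 1 for a correct LLM solution, minus the cost.
            reward = (1.0 if solved else 0.0) - query_cost
        else:
            reward = 0.0
        gate.update(state, action, reward)
    return gate, queries


if __name__ == "__main__":
    gate, queries = train_gate()
    policy = [gate.act(s) for s in range(6)]
    print("learned policy (0 = skip, 1 = query):", policy)
    print("queries issued:", queries, "of 60 opportunities")
```

Under these assumptions the gate stops querying in states where the mock LLM cannot yet complete the pattern, so it issues fewer queries than a policy that asks at every step. A real system would replace the mock with an actual LLM call and the table with a learned function of the primary agent's state.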

Code & Models

Repositories

gkthom/lagrseq (official)

Taxonomy

Topics: Topic Modeling · Software Engineering Research · Natural Language Processing Techniques