LaGR-SEQ: Language-Guided Reinforcement Learning with Sample-Efficient Querying
Thommen George Karimpanal, Laknath Buddhika Semage, Santu Rana, Hung Le, Truyen Tran, Sunil Gupta, Svetha Venkatesh

TL;DR
LaGR-SEQ leverages large language models to guide reinforcement learning efficiently by training a secondary agent to decide when to query the LLM, reducing costs and improving training efficiency in sequential decision tasks.
Contribution
This work introduces LaGR-SEQ, a framework that pairs LLM-guided RL (LaGR) with sample-efficient querying (SEQ): a secondary RL agent learns when to query the LLM, reducing the number of LLM queries needed during RL training.
Findings
Enhanced RL training efficiency with fewer LLM queries
Effective secondary RL agent for query decision-making
Demonstrated improvements on sequential decision tasks
Abstract
Large language models (LLMs) have recently demonstrated their impressive ability to provide context-aware responses via text. This ability could potentially be used to predict plausible solutions in sequential decision making tasks pertaining to pattern completion. For example, by observing a partial stack of cubes, LLMs can predict the correct sequence in which the remaining cubes should be stacked by extrapolating the observed patterns (e.g., cube sizes, colors or other attributes) in the partial stack. In this work, we introduce LaGR (Language-Guided Reinforcement learning), which uses this predictive ability of LLMs to propose solutions to tasks that have been partially completed by a primary reinforcement learning (RL) agent, in order to subsequently guide the latter's training. However, as RL training is generally not sample-efficient, deploying this approach would inherently…
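The idea of a secondary agent that learns when to query the LLM can be illustrated with a toy sketch. This is not the authors' implementation: the `QueryGate` class, the `simulated_llm_is_correct` stand-in, the reward shape (1 for a correct LLM solution, minus a fixed query cost) and the discrete "progress" states are all assumptions made for illustration. No real LLM is called.

```python
import random


class QueryGate:
    """Hypothetical tabular agent that decides whether to query the LLM."""

    SKIP, QUERY = 0, 1

    def __init__(self, n_states, epsilon=0.1, lr=0.5, seed=0):
        # One value estimate per (state, action) pair, initialised to zero.
        self.q = [[0.0, 0.0] for _ in range(n_states)]
        self.epsilon = epsilon
        self.lr = lr
        self.rng = random.Random(seed)

    def act(self, state, greedy=False):
        # Epsilon-greedy choice; ties are resolved in favour of SKIP.
        if not greedy and self.rng.random() < self.epsilon:
            return self.rng.choice((self.SKIP, self.QUERY))
        skip_value, query_value = self.q[state]
        return self.QUERY if query_value > skip_value else self.SKIP

    def update(self, state, action, reward):
        # One-step (bandit-style) update towards the observed reward.
        self.q[state][action] += self.lr * (reward - self.q[state][action])


def simulated_llm_is_correct(state):
    # Assumed stand-in for the LLM: its proposal is correct only once the
    # primary agent has completed enough of the task (state >= 2).
    return state >= 2


def train(gate, n_states, episodes=2000, query_cost=0.2):
    """Train the gate on randomly sampled states; return the query count."""
    queries = 0
    for _ in range(episodes):
        state = gate.rng.randrange(n_states)
        action = gate.act(state)
        if action == QueryGate.QUERY:
            queries += 1
            solved = simulated_llm_is_correct(state)
            reward = (1.0 if solved else 0.0) - query_cost
        else:
            reward = 0.0
        gate.update(state, action, reward)
    return queries


gate = QueryGate(n_states=4)
total_queries = train(gate, n_states=4)
policy = [gate.act(s, greedy=True) for s in range(4)]
```

Under these assumptions the learned greedy `policy` queries only in states where the simulated LLM is useful, and `total_queries` stays well below the number of training episodes, which is the cost saving the summary describes.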
Taxonomy
Topics: Topic Modeling · Software Engineering Research · Natural Language Processing Techniques
