DYSTIL: Dynamic Strategy Induction with Large Language Models for Reinforcement Learning

Borui Wang; Kathleen McKeown; Rex Ying

arXiv:2505.03209 · cs.LG · May 7, 2025

Open Access

TL;DR

DYSTIL leverages large language models to dynamically generate and internalize strategies in reinforcement learning, significantly improving generalization, sample efficiency, and interpretability in challenging environments.

Contribution

The paper introduces DYSTIL, a novel framework that uses LLMs to induce and internalize strategies, enhancing RL performance and interpretability compared to existing methods.

Findings

01. DYSTIL outperforms baselines by 17.75% in success rate.

02. It achieves higher sample efficiency during training.

03. It provides a textual channel for interpreting the strategies behind the policy.

Abstract

Reinforcement learning from expert demonstrations has long remained a challenging research problem, and existing state-of-the-art methods using behavioral cloning plus further RL training often suffer from poor generalization, low sample efficiency, and poor model interpretability. Inspired by the strong reasoning abilities of large language models (LLMs), we propose a novel strategy-based reinforcement learning framework integrated with LLMs called DYnamic STrategy Induction with Llms for reinforcement learning (DYSTIL) to overcome these limitations. DYSTIL dynamically queries a strategy-generating LLM to induce textual strategies based on advantage estimations and expert demonstrations, and gradually internalizes induced strategies into the RL agent through policy optimization to improve its performance through boosting policy generalization and enhancing sample efficiency. It also…
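The abstract says strategies are induced "based on advantage estimations and expert demonstrations". One way to picture that step is the minimal sketch below. It is not the authors' implementation: it assumes generalized advantage estimation (GAE) as the advantage estimator, assumes that the lowest-advantage steps are the ones worth showing to the strategy-generating LLM, and stops at building the prompt text rather than calling any model. All function names, parameters, and the prompt wording are hypothetical.

```python
from typing import List, Sequence, Tuple


def gae_advantages(
    rewards: Sequence[float],
    values: Sequence[float],
    gamma: float = 0.99,
    lam: float = 0.95,
) -> List[float]:
    """Generalized advantage estimation for a single trajectory.

    `values` must hold one more entry than `rewards`: the last entry is the
    bootstrap value of the state reached after the final step.
    """
    if len(values) != len(rewards) + 1:
        raise ValueError("values must have len(rewards) + 1 entries")
    advantages = [0.0] * len(rewards)
    running = 0.0
    for t in reversed(range(len(rewards))):
        delta = rewards[t] + gamma * values[t + 1] - values[t]
        running = delta + gamma * lam * running
        advantages[t] = running
    return advantages


def lowest_advantage_steps(
    observations: Sequence[str],
    actions: Sequence[str],
    advantages: Sequence[float],
    k: int = 2,
) -> List[Tuple[str, str, float]]:
    """Return the k steps with the lowest advantage (assumed selection rule)."""
    order = sorted(range(len(advantages)), key=lambda t: advantages[t])
    return [(observations[t], actions[t], advantages[t]) for t in order[:k]]


def build_strategy_prompt(
    current_strategies: Sequence[str],
    expert_demos: Sequence[str],
    weak_steps: Sequence[Tuple[str, str, float]],
) -> str:
    """Assemble a hypothetical prompt for a strategy-generating LLM."""
    lines = ["Current strategies:"]
    lines += [f"- {s}" for s in current_strategies]
    lines.append("Expert demonstrations:")
    lines += [f"- {d}" for d in expert_demos]
    lines.append("Steps where the agent's action had low estimated advantage:")
    lines += [
        f"- observation: {obs} | action: {act} | advantage: {adv:.2f}"
        for obs, act, adv in weak_steps
    ]
    lines.append("Propose a revised list of strategies.")
    return "\n".join(lines)


# Toy usage with made-up text observations and actions.
advs = gae_advantages([1.0, 0.0], [0.5, 0.5, 0.0], gamma=1.0, lam=1.0)
weak = lowest_advantage_steps(["door ahead", "wall ahead"], ["toggle", "forward"], advs, k=1)
prompt = build_strategy_prompt(
    ["Open doors before moving on."],
    ["see key -> pick up key"],
    weak,
)
```

In a full loop, the returned strategy list would be placed in the agent's input and the agent would continue policy optimization under it; that part depends on the agent architecture and is left out here.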

Taxonomy

Topics: Reinforcement Learning in Robotics