The Evolving Landscape of LLM- and VLM-Integrated Reinforcement Learning

Sheila Schoepp; Masoud Jafaripour; Yingyue Cao; Tianpei Yang; Fatemeh Abdollahi; Shadan Golestan; Zahin Sufiyan; Osmar R. Zaiane; Matthew E. Taylor

arXiv:2502.15214 · cs.LG · February 24, 2025

TL;DR

This survey reviews how Large Language Models (LLMs) and Vision-Language Models (VLMs) are integrated into reinforcement learning (RL) to address challenges such as lack of prior knowledge, long-horizon planning, and reward design. It proposes a taxonomy of these approaches and outlines future research directions.

Contribution

It introduces a taxonomy that categorizes LLM/VLM-assisted RL approaches into three roles (agent, planner, and reward), consolidates existing research, and identifies open problems for future work.

Findings

01. Categorizes LLM/VLM roles in RL as agent, planner, reward

02. Identifies open problems like grounding and bias mitigation

03. Provides a framework for future integration of multimodal models in RL

Abstract

Reinforcement learning (RL) has shown impressive results in sequential decision-making tasks. Meanwhile, Large Language Models (LLMs) and Vision-Language Models (VLMs) have emerged, exhibiting impressive capabilities in multimodal understanding and reasoning. These advances have led to a surge of research integrating LLMs and VLMs into RL. In this survey, we review representative works in which LLMs and VLMs are used to overcome key challenges in RL, such as lack of prior knowledge, long-horizon planning, and reward design. We present a taxonomy that categorizes these LLM/VLM-assisted RL approaches into three roles: agent, planner, and reward. We conclude by exploring open problems, including grounding, bias mitigation, improved representations, and action advice. By consolidating existing research and identifying future directions, this survey establishes a framework for integrating…
