Combining LLM decision and RL action selection to improve RL policy for adaptive interventions
Karine Karine, Benjamin M. Marlin

TL;DR
This paper proposes a hybrid approach combining Large Language Models and reinforcement learning to enhance personalized adaptive health interventions by incorporating real-time user preferences through text-based inputs.
Contribution
The study introduces a novel hybrid method that uses LLM responses to filter RL action selection, improving personalization in healthcare interventions.
Findings
The approach effectively incorporates user preferences into RL policies.
It improves personalization in adaptive health interventions.
Simulation results show enhanced policy performance.
Abstract
Reinforcement learning (RL) is increasingly being used in the healthcare domain, particularly for the development of personalized adaptive health interventions. Inspired by the success of Large Language Models (LLMs), we are interested in using LLMs to update the RL policy in real time, with the goal of accelerating personalization. We use the text-based user preference to influence the action selection on the fly, so that the preference is incorporated immediately. We use "user preference" as a broad term for a user's personal preference, constraint, health status, or a statement expressing a like or dislike. Our novel approach is a hybrid method that combines the LLM response and the RL action selection to improve the RL policy. Given an LLM prompt that incorporates the user preference, the LLM acts as a filter in the typical RL action selection. We investigate…
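The filtering idea described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the action names, the Q-values, the fallback action, and the keyword-based `stub_llm_filter` (a stand-in for a real LLM call) are all assumptions made for the example.

```python
from typing import Callable, Dict


def stub_llm_filter(preference: str, action: str) -> bool:
    """Hypothetical stand-in for an LLM call.

    A real system would prompt an LLM with the user preference and the
    candidate action, then parse the reply. Here a keyword rule plays that
    role so the sketch runs without any external service.
    """
    text = preference.lower()
    if action == "send_walk_message" and "twisted my ankle" in text:
        return False  # the stated preference rules out this action
    return True


def select_action(
    q_values: Dict[str, float],
    preference: str,
    llm_filter: Callable[[str, str], bool],
    fallback: str = "no_message",
) -> str:
    """Greedy RL action selection with the LLM acting as a filter.

    Candidate actions are tried from highest to lowest Q-value; the first
    one the filter accepts is returned. If every action is rejected, the
    (assumed) fallback action is used.
    """
    ranked = sorted(q_values, key=q_values.get, reverse=True)
    for action in ranked:
        if llm_filter(preference, action):
            return action
    return fallback


# Illustrative Q-values for one state (made-up numbers).
q_values = {
    "send_walk_message": 0.9,
    "send_stretch_message": 0.6,
    "no_message": 0.1,
}

chosen_without_pref = select_action(q_values, "", stub_llm_filter)
chosen_with_pref = select_action(
    q_values, "I twisted my ankle and cannot walk today.", stub_llm_filter
)
print(chosen_without_pref, chosen_with_pref)
```

With no stated preference the greedy action is kept; once the user reports an injury, the filter rejects the top-ranked action and the next-best one is chosen instead. How the paper actually combines the LLM response with the RL policy update is not specified in this summary, so the sketch covers only the action-selection step.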
