Are Large Language Models Dynamic Treatment Planners? An In Silico Study from a Prior Knowledge Injection Angle
Zhiyao Luo, Tingting Zhu

TL;DR
This study evaluates large language models as zero-shot insulin dosing agents in a simulated clinical setting, comparing their performance to trained reinforcement learning agents and highlighting their potential and limitations.
Contribution
It demonstrates that, with careful prompt engineering, open-source LLMs can perform comparably to or better than trained RL agents in insulin dosing tasks, revealing both opportunities and challenges.
Findings
Small LLMs achieve performance similar to or better than that of trained RL agents.
LLMs show limitations such as aggressive dosing and reasoning errors.
Prompt engineering and hybrid models are needed for safe clinical use.
Abstract
Reinforcement learning (RL)-based dynamic treatment regimes (DTRs) hold promise for automating complex clinical decision-making, yet their practical deployment remains hindered by the intensive engineering required to inject clinical knowledge and ensure patient safety. Recent advancements in large language models (LLMs) suggest a complementary approach, where implicit prior knowledge and clinical heuristics are naturally embedded through linguistic prompts without requiring environment-specific training. In this study, we rigorously evaluate open-source LLMs as dynamic insulin dosing agents in an in silico Type 1 diabetes simulator, comparing their zero-shot inference performance against small neural network-based RL agents (SRAs) explicitly trained for the task. Our results indicate that carefully designed zero-shot prompts enable smaller LLMs (e.g., Qwen2.5-7B) to achieve comparable…
