A Decision-Language Model (DLM) for Dynamic Restless Multi-Armed Bandit Tasks in Public Health

Nikhil Behari; Edwin Zhang; Yunfan Zhao; Aparna Taneja; Dheeraj Nagaraj; Milind Tambe

arXiv:2402.14807 · cs.MA · May 29, 2025 · 3 cites

Open Access

TL;DR

This paper introduces a Decision Language Model (DLM) that leverages Large Language Models to dynamically adapt and optimize resource allocation policies in public health using human-language commands, enhancing flexibility over traditional RMAB models.

Contribution

The paper presents a novel DLM framework that enables real-time policy fine-tuning in RMABs through natural language prompts and iterative feedback, bridging LLMs with public health resource management.

Findings

01. DLM effectively interprets human policy prompts.

02. DLM can generate and refine reward functions.

03. Simulation shows improved policy shaping with DLM.

Abstract

Restless multi-armed bandits (RMAB) have demonstrated success in optimizing resource allocation for large beneficiary populations in public health settings. Unfortunately, RMAB models lack flexibility to adapt to evolving public health policy priorities. Concurrently, Large Language Models (LLMs) have emerged as adept automated planners across domains of robotic control and navigation. In this paper, we propose a Decision Language Model (DLM) for RMABs, enabling dynamic fine-tuning of RMAB policies in public health settings using human-language commands. We propose using LLMs as automated planners to (1) interpret human policy preference prompts, (2) propose reward functions as code for a multi-agent RMAB environment, and (3) iterate on the generated reward functions using feedback from grounded RMAB simulations. We illustrate the application of DLM in collaboration with ARMMAN, an…
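The three steps listed in the abstract (interpret a prompt, propose reward functions as code, iterate using simulation feedback) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the LLM call is replaced by a stub, `propose_rewards`, that returns two hand-written candidate reward functions; the transition probabilities in `P_ENGAGED`, the "older" subgroup, and the function names are hypothetical; the policy is a greedy one-step heuristic rather than a Whittle index policy; and the final selection uses a simple scalar score where the paper describes LLM-driven iteration on feedback.

```python
# Toy sketch of a propose -> simulate -> select loop; not the paper's implementation.

# Assumed transition model: P(engaged next step | action, current state).
P_ENGAGED = {
    0: {0: 0.1, 1: 0.6},  # passive: no intervention delivered
    1: {0: 0.5, 1: 0.9},  # active: intervention delivered
}


def propose_rewards(prompt):
    """Stand-in for the LLM call: returns hypothetical candidate reward functions.

    Each candidate maps (state, is_older) -> float, with state 1 = engaged.
    The prompt is ignored here; a real system would condition on it.
    """
    return {
        "state_only": lambda state, is_older: float(state),
        "weight_older": lambda state, is_older: state * (3.0 if is_older else 1.0),
    }


def simulate(reward_fn, n_arms=20, budget=5, horizon=30):
    """Deterministic expected-value rollout of a toy two-state RMAB.

    Tracks q[i] = P(arm i is engaged) and, at each step, acts on the `budget`
    arms with the largest one-step expected reward gain under reward_fn.
    Returns mean engagement over the horizon for older arms and for all arms.
    """
    is_older = [i % 2 == 0 for i in range(n_arms)]  # every other arm is "older"
    q = [0.0] * n_arms
    older_total = 0.0
    all_total = 0.0

    for _ in range(horizon):
        def gain(i):
            # Increase in P(engaged) from acting, times the reward for engagement.
            lift = (q[i] * (P_ENGAGED[1][1] - P_ENGAGED[0][1])
                    + (1 - q[i]) * (P_ENGAGED[1][0] - P_ENGAGED[0][0]))
            return lift * (reward_fn(1, is_older[i]) - reward_fn(0, is_older[i]))

        chosen = set(sorted(range(n_arms), key=gain, reverse=True)[:budget])
        actions = [int(i in chosen) for i in range(n_arms)]
        q = [
            q[i] * P_ENGAGED[actions[i]][1] + (1 - q[i]) * P_ENGAGED[actions[i]][0]
            for i in range(n_arms)
        ]
        older_total += sum(q[i] for i in range(n_arms) if is_older[i])
        all_total += sum(q)

    n_older = sum(is_older)
    return {
        "older": older_total / (horizon * n_older),
        "all": all_total / (horizon * n_arms),
    }


def select_reward(prompt, target_group="older"):
    """Simulate every candidate and keep the one that best serves target_group.

    The max() below is a placeholder for the feedback-driven refinement step.
    """
    candidates = propose_rewards(prompt)
    feedback = {name: simulate(fn) for name, fn in candidates.items()}
    best = max(feedback, key=lambda name: feedback[name][target_group])
    return best, feedback


best, feedback = select_reward("Prioritize older beneficiaries")
print(best, feedback)
```

In this toy setup the candidate that up-weights the older subgroup directs the whole budget to those arms, so its simulated engagement for that subgroup is higher. This is the kind of grounded signal a feedback loop could use to compare candidate rewards.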

Taxonomy

Topics: Mind wandering and attention · AI in Service Interactions · Digital Mental Health Interventions