Risk-inclusive Contextual Bandits for Early Phase Clinical Trials

Rohit Kanrar; Chunlin Li; Zara Ghodsi; Margaret Gamalo

arXiv:2507.22344·stat.ME·February 13, 2026

TL;DR

This paper presents a risk-inclusive contextual bandit algorithm for early-phase clinical trials. It allocates doses using participant-specific covariates, balances safety against efficacy with two Thompson samplers, and estimates effect sizes with a generalized version of asymptotic confidence sequences.

Contribution

The paper introduces an algorithm that combines two Thompson samplers, one for efficacy and one for safety, with a generalized version of asymptotic confidence sequences (AsympCS) to improve dose allocation in clinical trials.

Findings

01. Outperforms traditional randomized dose allocation methods.

02. Provides uniform coverage guarantees for sequential causal inference.

03. Aligns well with real data from a Phase IIb study.
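
To make the "uniform coverage" finding concrete, here is a minimal sketch of the half-width of a standard asymptotic confidence sequence for a running mean, in the form given by Waudby-Smith et al. The paper's generalized version is not reproduced here. The function name `asymp_cs_halfwidth` and the defaults `alpha=0.05` and `rho=1.0` are assumptions made for illustration only.

```python
import math

def asymp_cs_halfwidth(t, var_hat, alpha=0.05, rho=1.0):
    """Half-width at time t of a standard (non-generalized) AsympCS.

    t       : number of observations so far (t >= 1)
    var_hat : running variance estimate at time t
    alpha   : target miscoverage level over the whole sequence
    rho     : positive tuning parameter (assumed value; it controls at
              which sample sizes the band is tightest)
    """
    scaled = t * var_hat * rho**2 + 1.0
    return math.sqrt(
        2.0 * scaled / (t**2 * rho**2) * math.log(math.sqrt(scaled) / alpha)
    )

# The band is wider than a fixed-sample 95% interval at the same t,
# which is the price of remaining valid at every interim look.
half_cs = asymp_cs_halfwidth(100, 1.0)
half_fixed = 1.96 * math.sqrt(1.0 / 100)
```

The interval at time t would be the running estimate plus or minus this half-width. Unlike a fixed-sample interval, it is intended to hold simultaneously over all t, so it can be monitored after each participant.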

Abstract

Early-phase clinical trials must select drug doses that balance safety and efficacy, a task made difficult by uncertain dose-response relationships and varied participant characteristics. Traditional randomized dose allocation ignores individual covariates and therefore often exposes participants to sub-optimal doses, which necessitates larger sample sizes and prolongs drug development. This paper introduces a risk-inclusive contextual bandit algorithm that uses multi-armed bandit (MAB) strategies to optimize dosing by integrating participant-specific data. By combining two separate Thompson samplers, one for efficacy and one for safety, the algorithm improves the balance between efficacy and safety in dose allocation. The effect sizes are estimated with a generalized version of asymptotic confidence sequences (AsympCS), offering a uniform coverage guarantee for sequential…
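
The idea of pairing one Thompson sampler for efficacy with one for safety can be sketched as follows. This is a deliberately simplified, non-contextual illustration and not the paper's algorithm: it assumes binary efficacy and toxicity outcomes, independent Beta(1, 1) priors per dose, and a hypothetical rule that picks the dose with the highest sampled efficacy among doses whose sampled toxicity is at most `tox_limit`. The function name `choose_dose` and all parameters are invented for this example.

```python
import numpy as np

def choose_dose(eff_succ, eff_fail, tox_succ, tox_fail, tox_limit=0.3, rng=None):
    """Pick a dose index using two independent Beta-Bernoulli Thompson samplers.

    eff_succ / eff_fail : per-dose counts of efficacy responses / non-responses
    tox_succ / tox_fail : per-dose counts of toxicity events / non-events
    tox_limit           : assumed cap on the sampled toxicity probability
    """
    rng = np.random.default_rng() if rng is None else rng
    # One posterior draw per dose from each sampler (uniform Beta(1, 1) priors).
    eff_draw = rng.beta(1 + np.asarray(eff_succ), 1 + np.asarray(eff_fail))
    tox_draw = rng.beta(1 + np.asarray(tox_succ), 1 + np.asarray(tox_fail))
    safe = tox_draw <= tox_limit
    if not safe.any():
        # No dose looks acceptably safe in this draw: fall back to the
        # dose with the lowest sampled toxicity.
        return int(np.argmin(tox_draw))
    # Among doses that look safe, take the highest sampled efficacy.
    return int(np.argmax(np.where(safe, eff_draw, -np.inf)))

# Toy history for three doses: low dose is safe but weak, middle dose is
# safe and effective, high dose is effective but toxic.
rng = np.random.default_rng(0)
picks = [
    choose_dose([5, 80, 95], [95, 20, 5], [1, 5, 90], [99, 95, 10], rng=rng)
    for _ in range(200)
]
```

With this toy history the middle dose should be chosen on almost every draw, because the high dose is screened out by the safety sampler and the low dose loses on sampled efficacy. A contextual version would replace the per-dose Beta posteriors with posteriors over regression parameters that depend on participant covariates.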
