Application of Multi-Armed Bandits to Model-assisted designs for Dose-Finding Clinical Trials
Masahiro Kojima

TL;DR
This paper explores the use of multi-armed bandit algorithms, including Thompson sampling and its variants, to improve dose-finding in clinical trials through simulation-based evaluation.
Contribution
It applies regularized Thompson sampling and a greedy algorithm to dose selection and proposes a posterior median-based method, adapting existing bandit approaches to small-sample clinical trials.
Findings
Regularized Thompson sampling improves dose selection accuracy.
Posterior median-based method offers a robust alternative.
Simulation results demonstrate better performance over traditional methods.
Abstract
We consider applying multi-armed bandits to model-assisted designs for dose-finding clinical trials. Multi-armed bandits are simple yet powerful methods for choosing actions that maximize a reward over a limited number of trials. Among multi-armed bandit algorithms, we first consider Thompson sampling, which chooses actions based on random samples from a posterior distribution. With small sample sizes, as is typical in dose-finding trials, the tails of the posterior distribution are heavier and the random samples are highly variable, so we also consider regularized Thompson sampling and a greedy algorithm. The greedy algorithm determines the dose based on the posterior mean. In addition, we propose a method that determines the dose based on the posterior median. We evaluate the performance of the proposed designs in six scenarios via simulation studies.
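The three selection rules named in the abstract can be illustrated with a minimal sketch. This is not the paper's design: the Beta(1, 1) prior, the target toxicity probability of 0.30, the "closest to target" criterion, and every function name below are assumptions made for illustration. Regularized Thompson sampling is omitted because its exact form is not given in the abstract.

```python
import random

TARGET = 0.30  # assumed target toxicity probability (hypothetical value)


def posterior(n_tox, n_pat, a0=1.0, b0=1.0):
    """Beta posterior parameters for one dose under an assumed Beta(a0, b0) prior."""
    return a0 + n_tox, b0 + n_pat - n_tox


def beta_median(a, b, grid=20000):
    """Approximate the Beta(a, b) median by midpoint-rule integration (needs a, b >= 1)."""
    xs = [(i + 0.5) / grid for i in range(grid)]
    weights = [x ** (a - 1) * (1 - x) ** (b - 1) for x in xs]
    half = 0.5 * sum(weights)
    acc = 0.0
    for x, w in zip(xs, weights):
        acc += w
        if acc >= half:
            return x
    return xs[-1]


def select_dose(tox, pat, rule, rng=random):
    """Return the index of the dose whose toxicity estimate is closest to TARGET."""
    estimates = []
    for y, n in zip(tox, pat):
        a, b = posterior(y, n)
        if rule == "thompson":   # one random draw from the posterior
            est = rng.betavariate(a, b)
        elif rule == "greedy":   # posterior mean
            est = a / (a + b)
        elif rule == "median":   # posterior median
            est = beta_median(a, b)
        else:
            raise ValueError(f"unknown rule: {rule}")
        estimates.append(est)
    return min(range(len(estimates)), key=lambda j: abs(estimates[j] - TARGET))


# Hypothetical interim data: toxicities and patients treated at three doses.
tox = [0, 1, 2]
pat = [3, 6, 3]
for rule in ("thompson", "greedy", "median"):
    print(rule, select_dose(tox, pat, rule))
```

With so few patients per dose, the Thompson draw can change from call to call, while the mean-based and median-based rules are deterministic given the data. That variability is the motivation the abstract gives for considering alternatives to plain Thompson sampling.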
Taxonomy
Topics: Statistical Methods in Clinical Trials · Advanced Bandit Algorithms Research · Advanced Causal Inference Techniques
