Mathematics of statistical sequential decision-making: concentration, risk-awareness and modelling in stochastic bandits, with applications to bariatric surgery

Patrick Saux

arXiv:2405.01994·stat.ML·May 6, 2024

Open Access

TL;DR

This thesis advances mathematical methods for risk-aware, safe sequential decision-making in healthcare, specifically for postoperative patient follow-up, by developing new concentration bounds, frameworks, algorithms, and interpretable models.

Contribution

It introduces new concentration bounds, a risk-aware bandit framework, nonparametric algorithms, and an interpretable model for personalized bariatric surgery follow-up.

Findings

01 New concentration bounds for bandit algorithms.

02 A risk-aware framework for complex decision-making.

03 An interpretable model for long-term weight prediction.

Abstract

This thesis studies some of the mathematical challenges that arise in the analysis of statistical sequential decision-making algorithms for postoperative patient follow-up. Stochastic bandits (multi-armed, contextual) model how an agent in an uncertain environment learns a sequence of actions (a policy) in order to maximise observed rewards. To learn optimal policies, bandit algorithms have to balance the exploitation of current knowledge against the exploration of uncertain actions. Such algorithms have been widely studied and deployed in industrial applications with large datasets, low-risk decisions and clear modelling assumptions, such as click-through rate maximisation in online advertising. By contrast, digital health recommendations call for a whole new paradigm of small samples, risk-averse agents and complex, nonparametric modelling. To this end, we developed new safe,…
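To make the exploration-exploitation balance mentioned in the abstract concrete, here is a minimal sketch of the classical UCB1 index policy on Bernoulli arms. It is a textbook illustration only, not one of the algorithms developed in the thesis; the function name `ucb1`, the arm means and the horizon are assumptions chosen for the example.

```python
import math
import random


def ucb1(arm_means, horizon, seed=0):
    """Run the textbook UCB1 policy on Bernoulli arms (illustrative only).

    arm_means: hypothetical success probabilities, unknown to the agent.
    Returns the number of times each arm was pulled.
    """
    rng = random.Random(seed)
    n_arms = len(arm_means)
    counts = [0] * n_arms   # pulls per arm
    sums = [0.0] * n_arms   # cumulative reward per arm

    for t in range(1, horizon + 1):
        if t <= n_arms:
            # Initialisation: pull every arm once.
            arm = t - 1
        else:
            # Index = empirical mean (exploitation) + confidence bonus (exploration).
            arm = max(
                range(n_arms),
                key=lambda a: sums[a] / counts[a]
                + math.sqrt(2.0 * math.log(t) / counts[a]),
            )
        # Simulated Bernoulli reward from the chosen arm.
        reward = 1.0 if rng.random() < arm_means[arm] else 0.0
        counts[arm] += 1
        sums[arm] += reward
    return counts


if __name__ == "__main__":
    print(ucb1([0.2, 0.8], horizon=2000))
```

The confidence bonus shrinks as an arm is pulled more often, so rarely tried arms keep being explored while the empirically best arm receives most of the pulls. The bonus here comes from a standard Hoeffding-type concentration bound for bounded rewards.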

Taxonomy

Topics: Advanced Bandit Algorithms Research