Mathematics of statistical sequential decision-making: concentration, risk-awareness and modelling in stochastic bandits, with applications to bariatric surgery
Patrick Saux

TL;DR
This thesis advances mathematical methods for risk-aware, safe sequential decision-making in healthcare, specifically for postoperative patient follow-up, by developing new concentration bounds, frameworks, algorithms, and interpretable models.
Contribution
It introduces new concentration bounds, a risk-aware bandit framework, nonparametric algorithms, and an interpretable model for personalized bariatric surgery follow-up.
Findings
New concentration bounds for bandit algorithms.
A risk-aware framework for complex decision-making.
An interpretable model for long-term weight prediction.
Abstract
This thesis aims to study some of the mathematical challenges that arise in the analysis of statistical sequential decision-making algorithms for postoperative patient follow-up. Stochastic bandits (multi-armed, contextual) model the learning of a sequence of actions (policy) by an agent in an uncertain environment in order to maximise observed rewards. To learn optimal policies, bandit algorithms have to balance the exploitation of current knowledge and the exploration of uncertain actions. Such algorithms have been widely studied and deployed in industrial applications with large datasets, low-risk decisions and clear modelling assumptions, such as clickthrough rate maximisation in online advertising. By contrast, digital health recommendations call for a whole new paradigm of small samples, risk-averse agents and complex, nonparametric modelling. To this end, we developed new safe,…
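To make the exploration/exploitation balance mentioned in the abstract concrete, here is a minimal sketch of the textbook UCB1 index policy on Bernoulli arms. It is not one of the thesis's algorithms: the function name `ucb1`, the arm means, the horizon and the seed are all illustrative assumptions, and the confidence bonus is the classical sqrt(2 log t / n) term rather than any bound derived in the thesis.

```python
import math
import random


def ucb1(arm_means, horizon, seed=0):
    """Run textbook UCB1 on simulated Bernoulli arms.

    arm_means are hypothetical success probabilities, used only to
    simulate rewards; the policy itself never reads them.
    Returns the number of pulls of each arm.
    """
    rng = random.Random(seed)
    n_arms = len(arm_means)
    counts = [0] * n_arms   # number of pulls per arm
    sums = [0.0] * n_arms   # cumulative reward per arm

    for t in range(1, horizon + 1):
        if t <= n_arms:
            # Initialisation: pull every arm once.
            arm = t - 1
        else:
            # Index = empirical mean (exploitation)
            #       + confidence bonus (exploration).
            indices = [
                sums[a] / counts[a]
                + math.sqrt(2.0 * math.log(t) / counts[a])
                for a in range(n_arms)
            ]
            arm = max(range(n_arms), key=indices.__getitem__)

        # Simulated Bernoulli reward for the chosen arm.
        reward = 1.0 if rng.random() < arm_means[arm] else 0.0
        counts[arm] += 1
        sums[arm] += reward

    return counts


counts = ucb1([0.2, 0.5, 0.8], horizon=2000)
print(counts)
```

The bonus shrinks as an arm is pulled more often, so rarely tried arms keep being revisited while most pulls concentrate on the arm with the best empirical mean. Policies of this kind target the expected reward only; they do not account for the risk aversion or small-sample constraints that the abstract identifies as specific to digital health.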
Taxonomy
Topics: Advanced Bandit Algorithms Research
