Conformal Bandits: Bringing statistical validity and reward efficiency to the small-gap regime
Simone Cuonzo, Nina Deliu

TL;DR
Conformal Bandits integrate conformal prediction into bandit algorithms to provide finite-sample statistical guarantees and improve reward efficiency in small-gap regimes, especially in financial portfolio applications.
Contribution
This paper introduces Conformal Bandits, a novel framework that combines conformal prediction with bandit algorithms to ensure statistical validity and enhance performance in small-gap scenarios.
Findings
Conformal Bandits achieve lower regret in small-gap regimes.
Unlike classical policies, they provide finite-sample coverage guarantees.
Combining the exploration-exploitation policy with hidden Markov models improves financial returns.
Abstract
We introduce Conformal Bandits, a novel framework integrating Conformal Prediction (CP) into bandit problems, a classic paradigm for sequential decision-making under uncertainty. Traditional regret-minimisation bandit strategies like Thompson Sampling and Upper Confidence Bound (UCB) typically rely on distributional assumptions or asymptotic guarantees; further, they remain largely focused on regret, neglecting their statistical properties. We address this gap. Through the adoption of CP, we bridge the regret-minimising potential of a decision-making bandit policy with statistical guarantees in the form of finite-time prediction coverage. We demonstrate the potential of Conformal Bandits through simulation studies and an application to portfolio allocation, a typical small-gap regime, where differences in arm rewards are far too small for classical policies to achieve optimal…
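To make the idea of finite-sample prediction coverage inside a bandit loop concrete, here is a minimal Python sketch. It is not the authors' algorithm, which this summary does not describe. It assumes a standard split-conformal interval built from each arm's past rewards, and a hypothetical UCB-style rule that pulls the arm with the largest upper conformal bound. The names `conformal_interval`, `select_arm`, and `simulate`, the Gaussian reward model, and the default `alpha` are all illustrative assumptions. The usual split-conformal guarantee (coverage of at least 1 - alpha) relies on exchangeability, which adaptive arm selection can violate; how the paper handles this is not stated in the excerpt.

```python
import math
import random


def conformal_interval(rewards, alpha=0.1):
    """Split-conformal prediction interval for one arm's next reward.

    Assumption: the first half of the rewards fits a point prediction (the mean)
    and the second half calibrates absolute-residual nonconformity scores.
    """
    n = len(rewards)
    if n < 4:
        # Too little data: return a trivial interval, which also forces exploration.
        return (-math.inf, math.inf)
    half = n // 2
    fit, calib = rewards[:half], rewards[half:]
    center = sum(fit) / len(fit)
    scores = sorted(abs(r - center) for r in calib)
    # Finite-sample corrected rank: ceil((n_cal + 1) * (1 - alpha)).
    k = math.ceil((len(calib) + 1) * (1 - alpha))
    if k > len(calib):
        # Not enough calibration points for the requested coverage level.
        return (-math.inf, math.inf)
    q = scores[k - 1]
    return (center - q, center + q)


def select_arm(history, alpha=0.1):
    """Hypothetical rule: pull the arm with the largest upper conformal bound."""
    uppers = [conformal_interval(h, alpha)[1] for h in history]
    return max(range(len(history)), key=lambda a: uppers[a])


def simulate(means, horizon=500, alpha=0.1, seed=0):
    """Run the sketch on Gaussian arms (unit variance, an assumption)."""
    rng = random.Random(seed)
    history = [[] for _ in means]
    for _ in range(horizon):
        arm = select_arm(history, alpha)
        history[arm].append(rng.gauss(means[arm], 1.0))
    return history
```

Calling `simulate([0.00, 0.01, 0.02])` mimics a small-gap setting, where the arm means differ by far less than the reward noise. The returned per-arm reward lists can then be inspected for pull counts and empirical coverage.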
Taxonomy
Topics: Advanced Bandit Algorithms Research · Risk and Portfolio Optimization · Stochastic Gradient Optimization Techniques
