One Good Source is All You Need: Near-Optimal Regret for Bandits under Heterogeneous Noise
Amith Bhat, Haipeng Luo, Aadirupa Saha

TL;DR
This paper introduces SOAR, an algorithm for multi-armed bandits with multiple data sources whose noise variances are unknown and heterogeneous. It achieves near-optimal regret by adaptively selecting both sources and arms.
Contribution
The paper proposes SOAR, a novel algorithm that efficiently identifies the best (minimum-variance) data source together with the best arm, achieving near-optimal regret bounds even though the source variances are unknown.
Findings
SOAR attains regret close to that of the ideal scenario in which the learner queries only the minimum-variance source.
It outperforms baselines such as Uniform UCB and Explore-then-Commit UCB on synthetic and real-world datasets.
The theoretical regret bounds improve on those of existing methods when noise is heterogeneous across sources.
Abstract
We study the multi-armed bandit (MAB) problem with heterogeneous data sources, each exhibiting an unknown and distinct noise variance. The learner's objective is standard MAB regret minimization, with the additional complexity of adaptively selecting which data source to query at each round. We propose Source-Optimistic Adaptive Regret minimization (SOAR), a novel algorithm that quickly prunes high-variance sources using sharp variance-concentration bounds, followed by a "balanced min-max LCB-UCB approach" that seamlessly integrates the parallel tasks of identifying the best arm and the optimal (minimum-variance) data source. Our analysis shows SOAR achieves an instance-dependent regret bound, up to preprocessing costs depending only on…
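To make the source-pruning idea from the abstract concrete, here is a minimal sketch in Python. It is not the paper's SOAR algorithm: the function name `prune_sources`, the `sample_fn` interface, and the confidence width are all assumptions made for illustration, and the width is a simple heuristic rather than the sharp variance-concentration bound the authors use. The sketch only shows the general pattern of sampling every source, building an interval around each sample variance, and discarding any source whose interval lies entirely above the best source's interval.

```python
import math
import random
import statistics


def prune_sources(sample_fn, num_sources, n_samples, delta=0.05):
    """Return indices of sources whose variance could plausibly be the smallest.

    sample_fn(j) is a hypothetical interface returning one noisy observation
    from source j.
    """
    # Heuristic relative confidence width for a sample variance (an assumption,
    # not the paper's bound); it shrinks like sqrt(log(1/delta) / n).
    width = min(0.99, 2.0 * math.sqrt(math.log(2 * num_sources / delta) / n_samples))

    lower, upper = [], []
    for j in range(num_sources):
        obs = [sample_fn(j) for _ in range(n_samples)]
        s2 = statistics.variance(obs)  # unbiased sample variance
        # If s2 lies within (1 +/- width) * sigma^2, then sigma^2 lies in
        # the interval [s2 / (1 + width), s2 / (1 - width)].
        lower.append(s2 / (1 + width))
        upper.append(s2 / (1 - width))

    # Keep a source only if its lower bound does not exceed the smallest upper bound.
    best_upper = min(upper)
    return [j for j in range(num_sources) if lower[j] <= best_upper]


rng = random.Random(0)
sigmas = [0.5, 2.0, 5.0]  # true noise levels, unknown to the learner
survivors = prune_sources(lambda j: rng.gauss(1.0, sigmas[j]), len(sigmas), n_samples=2000)
print(survivors)
```

With noise levels this far apart, only the lowest-variance source should survive. When several sources have similar variances, more than one survives, which is where a later stage (in the paper, the LCB-UCB step) would still have to choose among them.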
