Multi-Armed Sampling Problem and the End of Exploration

Mohammad Pedramfar; Siamak Ravanbakhsh

arXiv:2507.10797·cs.LG·May 14, 2026

TL;DR

This paper introduces multi-armed sampling as a framework for analyzing exploration in sampling tasks. Its results suggest that sampling needs far less exploration than optimization, with implications for reinforcement learning and neural sampling.

Contribution

The paper formalizes multi-armed sampling, defines plausible notions of regret with corresponding lower bounds, proposes a simple algorithm with near-optimal regret, and connects sampling to multi-armed bandits through a unifying temperature parameter.

Findings

01. Sampling requires little to no exploration for near-optimal performance.

02. The framework unifies sampling and bandit problems via a temperature parameter.

03. Results have implications for entropy-regularized reinforcement learning and neural samplers.

Abstract

This paper introduces the framework of multi-armed sampling, which serves as the sampling counterpart to the optimization problem of multi-armed bandits. Our primary motivation is to rigorously examine the exploration-exploitation trade-off in the context of sampling. We systematically define plausible notions of regret for this framework and establish corresponding lower bounds. We then propose a simple algorithm that achieves near-optimal regret bounds. Our theoretical results suggest that, in contrast to optimization, sampling barely requires any exploration. To further connect our findings with those of multi-armed bandits, we define a continuous family of problems and associated regret measures that smoothly interpolate and unify multi-armed sampling and multi-armed bandit problems using a temperature parameter. We believe that the multi-armed sampling framework and our findings in…
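
As a rough illustration of how a temperature parameter can interpolate between sampling and optimization, the sketch below uses a Boltzmann (softmax) distribution over arm means. This is an assumption chosen for illustration only: the abstract does not state the paper's target distribution or regret definitions, and the names `boltzmann` and `arm_means`, as well as the reward values, are hypothetical.

```python
import math


def boltzmann(means, temperature):
    """Return probabilities p_i proportional to exp(means[i] / temperature).

    Illustrative assumption, not the paper's definition of its
    temperature-indexed family of problems.
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [m / temperature for m in means]
    peak = max(scaled)  # subtract the max before exponentiating for stability
    weights = [math.exp(s - peak) for s in scaled]
    total = sum(weights)
    return [w / total for w in weights]


if __name__ == "__main__":
    arm_means = [0.2, 0.5, 0.9]  # hypothetical mean rewards of three arms
    for temperature in (1.0, 0.1, 0.01):
        probs = boltzmann(arm_means, temperature)
        print(temperature, [round(p, 3) for p in probs])
```

In this sketch, at temperature 1.0 the probability mass stays spread across the arms, which resembles a sampling target. As the temperature approaches 0 the mass concentrates on the highest-mean arm, which resembles the bandit objective of identifying the best arm.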
