Pure Exploration with Infinite Answers
Riccardo Poiani, Martino Bernasconi, Andrea Celli

TL;DR
This paper studies pure exploration problems in which the set of correct answers may be infinite. It derives an instance-dependent lower bound, explains why existing finite-answer methods are not asymptotically optimal in this setting, and proposes a new framework that is.
Contribution
The paper introduces Sticky-Sequence Track-and-Stop, a framework that generalizes both Track-and-Stop and Sticky Track-and-Stop and is asymptotically optimal for pure exploration problems with possibly infinite answer sets.
Findings
Derived an instance-dependent lower bound for infinite-answer problems
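Showed why existing finite-answer methods such as Sticky Track-and-Stop are not asymptotically optimal in the infinite-answer setting
Proposed Sticky-Sequence Track-and-Stop, an asymptotically optimal framework whose analysis also identifies special cases where existing methods are optimal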
Abstract
We study pure exploration problems in which the set of correct answers is possibly infinite. For example, such problems arise when regressing a continuous function on the means of the bandit or when learning Nash equilibria by querying noisy values of the payoff matrix. We derive an instance-dependent lower bound for these problems. By analyzing it, we discuss why existing methods (i.e., Sticky Track-and-Stop) for finite answer problems fail at being asymptotically optimal in this more general setting. Finally, we present a framework, Sticky-Sequence Track-and-Stop, which generalizes both Track-and-Stop and Sticky Track-and-Stop, and that enjoys asymptotic optimality. Due to its generality, our analysis also highlights special cases where existing methods enjoy optimality.
Taxonomy
Topics: Advanced Bandit Algorithms Research · Machine Learning and Algorithms · Optimization and Search Problems
