TL;DR
This paper investigates how best-action queries can improve online learning by providing additional information, establishing tight bounds on regret with limited queries under different feedback models.
Contribution
It introduces the concept of best-action queries in online learning and derives tight regret bounds for scenarios with limited query access, highlighting the benefits of even sublinear query budgets.
Findings
In the full feedback model, $k$ queries achieve an optimal regret of Θ(min{√T, T/k}).
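When feedback is observed only at the time steps where a query is issued, the regret is Θ(min{T/√k, T²/k²}). This improves on the standard Θ(T/√k) rate for label-efficient prediction once k exceeds order T^(2/3).
In the full feedback model, even a sublinear query budget helps: once k exceeds order √T, the T/k term falls below the Θ(√T) regret achievable without queries.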
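The two rates in the findings can be compared numerically. The sketch below evaluates them with all constants and logarithmic factors dropped, so the outputs indicate orders of magnitude only. The function names `full_feedback_rate` and `query_only_feedback_rate` are illustrative and do not come from the paper.

```python
import math


def full_feedback_rate(T: int, k: int) -> float:
    """Order of the regret in the full feedback model: min{sqrt(T), T/k}.

    Constants are dropped; this is a rate, not an exact regret value.
    """
    if not 1 <= k <= T:
        raise ValueError("expected 1 <= k <= T")
    return min(math.sqrt(T), T / k)


def query_only_feedback_rate(T: int, k: int) -> float:
    """Order of the regret when feedback arrives only on query steps:
    min{T/sqrt(k), T^2/k^2}. Constants are dropped.
    """
    if not 1 <= k <= T:
        raise ValueError("expected 1 <= k <= T")
    return min(T / math.sqrt(k), T**2 / k**2)


if __name__ == "__main__":
    T = 10_000  # sqrt(T) = 100, T^(2/3) is roughly 464
    for k in (50, 400, 2_500):
        print(k, full_feedback_rate(T, k), query_only_feedback_rate(T, k))
```

With T = 10,000, the full feedback rate drops below √T = 100 only for k above 100, and the query-only rate switches from T/√k to T²/k² once k passes roughly 464, matching the thresholds √T and T^(2/3) implied by the two formulas.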
Abstract
In online learning, a decision maker repeatedly selects one of a set of actions, with the goal of minimizing the overall loss incurred. Following the recent line of research on algorithms endowed with additional predictive features, we revisit this problem by allowing the decision maker to acquire additional information on the actions to be selected. In particular, we study the power of \emph{best-action queries}, which reveal beforehand the identity of the best action at a given time step. In practice, predictive features may be expensive, so we allow the decision maker to issue at most $k$ such queries. We establish tight bounds on the performance any algorithm can achieve when given access to $k$ best-action queries for different types of feedback models. In particular, we prove that in the full feedback model, $k$ queries are enough to achieve an optimal regret of…
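The setting described in the abstract can be simulated in a few lines. The sketch below is only an illustration under stated assumptions, not the paper's algorithm: it runs the standard Hedge (exponential weights) learner in the full feedback model and spends its k best-action queries on time steps chosen uniformly at random, playing the revealed best action on those steps. The function name `hedge_with_queries`, the random query schedule, and the learning rate are all assumptions made for this example.

```python
import math
import random


def hedge_with_queries(losses, k, seed=0):
    """Simulate Hedge with k best-action queries on a T x n loss matrix.

    Assumptions (illustrative, not taken from the paper):
      * losses[t][i] is in [0, 1] and is fully revealed after each step;
      * the k query steps are drawn uniformly at random in advance;
      * on a query step the learner plays the revealed best action.
    Returns (learner_loss, best_fixed_action_loss).
    """
    rng = random.Random(seed)
    T, n = len(losses), len(losses[0])
    query_steps = set(rng.sample(range(T), k))
    eta = math.sqrt(8 * math.log(n) / T)  # a standard Hedge learning rate
    cumulative = [0.0] * n  # cumulative loss of each action so far
    learner_loss = 0.0
    for t, loss_t in enumerate(losses):
        if t in query_steps:
            # Best-action query: the identity of the best action is revealed.
            action = min(range(n), key=loss_t.__getitem__)
        else:
            # Hedge step: sample proportionally to exp(-eta * cumulative loss).
            base = min(cumulative)  # shift for numerical stability
            weights = [math.exp(-eta * (c - base)) for c in cumulative]
            action = rng.choices(range(n), weights=weights)[0]
        learner_loss += loss_t[action]
        for i in range(n):  # full feedback: every action's loss is observed
            cumulative[i] += loss_t[i]
    return learner_loss, min(cumulative)


if __name__ == "__main__":
    data_rng = random.Random(1)
    T, n = 2_000, 5
    losses = [[data_rng.random() for _ in range(n)] for _ in range(T)]
    for k in (0, 50, 500):
        learner, best = hedge_with_queries(losses, k)
        print(f"k={k:4d}  regret={learner - best:.1f}")
```

Regret here is the learner's total loss minus the loss of the best fixed action in hindsight. Because a query step incurs the smallest loss available at that step, the regret can become negative when k is large.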
