Apple Tasting Revisited: Bayesian Approaches to Partially Monitored Online Binary Classification
James A. Grant, David S. Leslie

TL;DR
This paper explores Bayesian methods, especially Thompson Sampling, for the online binary classification 'apple tasting' problem with partial label observations, demonstrating improved theoretical regret bounds and empirical performance.
Contribution
It introduces a Bayesian approach using Thompson Sampling with Pólya-Gamma augmentation for the partial monitoring problem, achieving better regret bounds and empirical results.
Findings
Thompson Sampling attains a Bayesian regret bound of improved order relative to previous approaches.
Pólya-Gamma augmentation enhances approximation efficiency.
Bayesian methods outperform existing approaches in experiments.
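For context, the standard Pólya-Gamma identity (Polson, Scott and Windle, 2013) behind this kind of augmentation is sketched below. The notation is generic and is an assumption for illustration, not taken from the paper: ψ stands for a linear predictor such as x^T θ, and p(ω) is the density of a Pólya-Gamma PG(b, 0) random variable.

```latex
\frac{\left(e^{\psi}\right)^{a}}{\left(1+e^{\psi}\right)^{b}}
  = 2^{-b}\, e^{\kappa\psi} \int_{0}^{\infty} e^{-\omega\psi^{2}/2}\, p(\omega)\, d\omega,
\qquad \kappa = a - \tfrac{b}{2}.
```

Conditional on ω, the logistic likelihood is therefore proportional to a Gaussian kernel in ψ, so a Gaussian prior on the regression coefficients yields a Gaussian conditional posterior that can be sampled directly.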
Abstract
We consider a variant of online binary classification where a learner sequentially assigns labels (0 or 1) to items with unknown true class. If, but only if, the learner chooses label 1 they immediately observe the true label of the item. The learner faces a trade-off between short-term classification accuracy and long-term information gain. This problem has previously been studied under the name of the 'apple tasting' problem. We revisit this problem as a partial monitoring problem with side information, and focus on the case where item features are linked to true classes via a logistic regression model. Our principal contribution is a study of the performance of Thompson Sampling (TS) for this problem. Using recently developed information-theoretic tools, we show that TS achieves a Bayesian regret bound of an improved order to previous approaches. Further, we experimentally…
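The feedback structure described in the abstract can be illustrated with a small simulation. The following is a minimal sketch under stated assumptions, not the authors' implementation: it swaps in a Laplace (Gaussian) approximation to the logistic-regression posterior instead of Pólya-Gamma augmentation, assumes a standard normal prior, Gaussian features, and a 0-1 misclassification loss, and all function names (`laplace_posterior`, `run_apple_tasting_ts`) are hypothetical.

```python
import numpy as np

def sigmoid(z):
    # Clip to avoid overflow warnings in exp for extreme inputs.
    return 1.0 / (1.0 + np.exp(-np.clip(z, -30.0, 30.0)))

def laplace_posterior(X, y, prior_var=1.0, n_newton=20):
    """Gaussian approximation to the posterior of logistic-regression weights.

    Assumes a N(0, prior_var * I) prior; runs a fixed number of Newton steps
    toward the MAP estimate and returns (mean, covariance).
    """
    d = X.shape[1]
    theta = np.zeros(d)
    for _ in range(n_newton):
        p = sigmoid(X @ theta)
        grad = X.T @ (p - y) + theta / prior_var
        hess = (X * (p * (1 - p))[:, None]).T @ X + np.eye(d) / prior_var
        theta = theta - np.linalg.solve(hess, grad)
    p = sigmoid(X @ theta)
    hess = (X * (p * (1 - p))[:, None]).T @ X + np.eye(d) / prior_var
    return theta, np.linalg.inv(hess)

def run_apple_tasting_ts(T=300, d=3, seed=0):
    """Simulate Thompson Sampling with apple-tasting feedback.

    Returns (number of misclassifications, number of labels observed).
    """
    rng = np.random.default_rng(seed)
    theta_star = rng.normal(size=d)          # unknown true parameter (simulated)
    X_obs, y_obs = np.empty((0, d)), np.empty(0)
    mistakes = 0
    for _ in range(T):
        x = rng.normal(size=d)               # item features
        y = int(rng.random() < sigmoid(x @ theta_star))  # hidden true class
        mean, cov = laplace_posterior(X_obs, y_obs)
        theta_tilde = rng.multivariate_normal(mean, cov)  # posterior sample
        a = int(sigmoid(x @ theta_tilde) >= 0.5)          # act on the sample
        mistakes += int(a != y)              # assumed 0-1 loss
        if a == 1:                           # label revealed only for action 1
            X_obs = np.vstack([X_obs, x])
            y_obs = np.append(y_obs, float(y))
    return mistakes, len(y_obs)

if __name__ == "__main__":
    mistakes, n_observed = run_apple_tasting_ts()
    print(f"mistakes: {mistakes}, labels observed: {n_observed}")
```

The key point is the `if a == 1` branch: the posterior is only ever updated with items the learner labelled 1, so choosing label 0 is safe in the short term but yields no information, which is the trade-off the abstract describes.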
Taxonomy
Topics: Advanced Bandit Algorithms Research · Auction Theory and Applications · Machine Learning and Algorithms
Methods: Spatio-temporal stability analysis · Logistic Regression
