Stream-based Online Active Learning in a Contextual Multi-Armed Bandit Framework
Linqi Song

TL;DR
This paper introduces a stream-based online active learning algorithm for contextual multi-armed bandits that balances reward maximization against the cost of querying the reward ground truth. Its regret is sublinear, comparable to that of conventional contextual bandit algorithms operating without query costs.
Contribution
It proposes a novel algorithm that refines context and arm spaces and strategically requests reward ground truths, accounting for prior information and variable query costs.
Findings
Regret is proven to be sublinear, comparable to that of conventional contextual bandit algorithms that incur no query costs.
The algorithm effectively balances exploration and exploitation with query costs.
Partition refinement improves learning accuracy over time.
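To make "sublinear regret" concrete, one common way to write regret when queries are costly is sketched below. The symbols are assumptions introduced here for illustration and may differ from the paper's own definition: x_t is the context at time t, a_t the chosen arm, \mu_a(x) the expected reward of arm a under context x, q_t an indicator of whether the ground truth was requested, and c_t the cost of that request.

```latex
% Assumed notation, not taken from the paper.
R(T) = \sum_{t=1}^{T} \Bigl[ \max_{a} \mu_{a}(x_t) - \mu_{a_t}(x_t) \Bigr]
       + \sum_{t=1}^{T} c_t \, q_t ,
\qquad q_t \in \{0, 1\}.

% Regret is sublinear when its time average vanishes:
\lim_{T \to \infty} \frac{R(T)}{T} = 0 .
```

Under this reading, querying at every time slot with a cost bounded away from zero makes the second sum grow linearly in T, which is why the algorithm must decide when a query is worth paying for.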
Abstract
We study stream-based online active learning in a contextual multi-armed bandit framework. In this framework, the reward depends on both the arm and the context. In a stream-based active learning setting, obtaining the ground truth of the reward is costly, and conventional contextual multi-armed bandit algorithms fail to achieve sublinear regret because of this cost. Hence, the algorithm needs to determine whether or not to request the ground truth of the reward at the current time slot. In our framework, we consider a stream-based active learning setting in which a query request for the ground truth is sent to the annotator, together with some prior information about the ground truth. The query cost varies depending on the accuracy of the prior information. Our algorithm mainly carries out two operations: the refinement of the context and arm spaces and the selection of actions. In our…
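The two operations named in the abstract, refining the partition and choosing actions together with a query decision, can be illustrated with a toy sketch. This is not the paper's algorithm: it assumes a one-dimensional context in [0, 1], a finite set of arms (so only the context space is refined), and hypothetical thresholds `base_queries` and `base_split` that grow with partition depth. The class names, the reward model, and the per-round `cost` argument are all invented for illustration.

```python
import random


class Cell:
    """One interval [lo, hi) of the context space with per-arm statistics."""

    def __init__(self, lo, hi, depth, n_arms):
        self.lo, self.hi, self.depth = lo, hi, depth
        self.arrivals = 0               # contexts seen in this cell
        self.counts = [0] * n_arms      # queried rewards per arm
        self.means = [0.0] * n_arms     # running mean of queried rewards


class ActiveBandit:
    """Toy learner: query while a cell is under-sampled, otherwise exploit."""

    def __init__(self, n_arms, base_queries=5, base_split=50):
        self.n_arms = n_arms
        self.base_queries = base_queries  # hypothetical threshold
        self.base_split = base_split      # hypothetical threshold
        self.cells = [Cell(0.0, 1.0, 0, n_arms)]
        self.queries = 0
        self.total_cost = 0.0

    def _find(self, x):
        for cell in self.cells:
            if cell.lo <= x < cell.hi or (x == cell.hi == 1.0):
                return cell
        raise ValueError("context must lie in [0, 1]")

    def _split(self, cell):
        # Refinement: replace the cell by two halves with fresh statistics.
        mid = (cell.lo + cell.hi) / 2
        self.cells.remove(cell)
        self.cells.append(Cell(cell.lo, mid, cell.depth + 1, self.n_arms))
        self.cells.append(Cell(mid, cell.hi, cell.depth + 1, self.n_arms))

    def step(self, x, observe_reward, cost):
        cell = self._find(x)
        cell.arrivals += 1
        needed = self.base_queries * 2 ** cell.depth
        under = [a for a in range(self.n_arms) if cell.counts[a] < needed]
        if under:
            # Explore: pay `cost` to obtain the ground-truth reward.
            arm = under[0]
            reward = observe_reward(x, arm)
            cell.counts[arm] += 1
            cell.means[arm] += (reward - cell.means[arm]) / cell.counts[arm]
            self.queries += 1
            self.total_cost += cost
        else:
            # Exploit: play the empirically best arm without querying.
            arm = max(range(self.n_arms), key=lambda a: cell.means[a])
        if cell.arrivals >= self.base_split * 4 ** cell.depth:
            self._split(cell)
        return arm


def simulate(T=5000, seed=0, cost=0.1):
    """Run the toy learner on an assumed two-arm Bernoulli reward model."""
    rng = random.Random(seed)

    def observe_reward(x, arm):
        best = 0 if x < 0.5 else 1          # assumed ground truth
        p = 0.8 if arm == best else 0.2
        return 1.0 if rng.random() < p else 0.0

    bandit = ActiveBandit(n_arms=2)
    for _ in range(T):
        bandit.step(rng.random(), observe_reward, cost)
    return bandit


if __name__ == "__main__":
    b = simulate()
    print(len(b.cells), "cells,", b.queries, "queries, cost", round(b.total_cost, 2))
```

The `cost` argument is passed per round so that it could vary, for example with the accuracy of the prior information sent to the annotator; how the paper maps prior accuracy to cost is not stated in the visible part of the abstract, so the sketch leaves that mapping to the caller.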
Taxonomy
Topics: Advanced Bandit Algorithms Research · Machine Learning and Algorithms · Data Stream Mining Techniques
