Active Learning of General Halfspaces: Label Queries vs Membership Queries
Ilias Diakonikolas, Daniel M. Kane, Mingchen Ma

TL;DR
This paper studies the limits of active learning for general halfspaces under the Gaussian distribution. It shows that label queries alone cannot beat the passive label complexity unless the unlabeled pool is exponentially large, whereas membership queries circumvent this lower bound, even in the agnostic model.
Contribution
It establishes lower bounds for label-only active learning and introduces efficient algorithms using membership queries, highlighting a separation between the two query models.
Findings
With label queries only, an active learner needs an exponentially large unlabeled pool to beat the passive label complexity.
Membership queries enable efficient learning with fewer queries, even in the agnostic model.
A strong separation exists between label-only and membership query models.
Abstract
We study the problem of learning general (i.e., not necessarily homogeneous) halfspaces under the Gaussian distribution on $\mathbb{R}^d$ in the presence of some form of query access. In the classical pool-based active learning model, where the algorithm is allowed to make adaptive label queries to previously sampled points, we establish a strong information-theoretic lower bound ruling out non-trivial improvements over the passive setting. Specifically, we show that any active learner requires label complexity of $\tilde{\Omega}(d/(\log(m)\epsilon))$, where $m$ is the number of unlabeled examples. In particular, to beat the passive label complexity of $\tilde{O}(d/\epsilon)$, an active learner requires a pool of $2^{\mathrm{poly}(d)}$ unlabeled samples. On the positive side, we show that this lower bound can be circumvented with membership query access, even in the agnostic model. Specifically, we give…
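The two query models contrasted in the abstract can be sketched in a few lines of Python. This is a toy illustration only, not the paper's algorithm: the class names `LabelOracle` and `MembershipOracle`, the helper `boundary_along_ray`, and the choices of dimension, threshold, and pool size are all hypothetical. The sketch also assumes the learner already knows a direction along which the label flips, which a real learner would have to find.

```python
import random

def make_halfspace(w, t):
    """Return f(x) = +1 if <w, x> >= t else -1 (a general, non-homogeneous halfspace)."""
    def f(x):
        return 1 if sum(wi * xi for wi, xi in zip(w, x)) >= t else -1
    return f

class LabelOracle:
    """Pool-based model: labels can be requested only for points already in the pool."""
    def __init__(self, f, pool):
        self._f, self.pool, self.queries = f, pool, 0

    def query(self, index):
        self.queries += 1
        return self._f(self.pool[index])

class MembershipOracle:
    """Membership-query model: the learner may ask for the label of any point."""
    def __init__(self, f):
        self._f, self.queries = f, 0

    def query(self, x):
        self.queries += 1
        return self._f(x)

def boundary_along_ray(oracle, u, hi=16.0, steps=30):
    """Binary search for the scale s at which the label of s*u flips from -1 to +1.

    Assumption of this sketch: the origin is labeled -1 and hi*u is labeled +1.
    """
    lo = 0.0
    for _ in range(steps):
        mid = (lo + hi) / 2
        if oracle.query([mid * ui for ui in u]) == 1:
            hi = mid
        else:
            lo = mid
    return (lo + hi) / 2

rng = random.Random(0)
d, t = 5, 6.0                      # large threshold: a heavily biased halfspace
w = [1.0] + [0.0] * (d - 1)        # unit normal vector (hypothetical target)
f = make_halfspace(w, t)

# Label queries: the learner sees only Gaussian samples, which almost never
# land on the rare positive side of a heavily biased halfspace.
pool = [[rng.gauss(0, 1) for _ in range(d)] for _ in range(2000)]
label_oracle = LabelOracle(f, pool)
positives_in_pool = sum(label_oracle.query(i) == 1 for i in range(len(pool)))

# Membership queries: the learner can probe points far from the bulk of the
# distribution and locate the boundary along a ray with a few dozen queries.
mq = MembershipOracle(f)
estimate = boundary_along_ray(mq, w)
print(positives_in_pool, mq.queries, round(estimate, 6))
```

In this toy setting every pool point is labeled negative, so 2000 label queries reveal nothing about where the boundary lies, while 30 membership queries pin down the threshold along the chosen ray. This is meant only as intuition for how the two models differ, not as a proof of the separation.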
Taxonomy
Topics: Machine Learning and Algorithms · DNA and Biological Computing · semigroups and automata theory
