Weak to Strong Learning from Aggregate Labels
Yukti Makhija, Rishi Saket

TL;DR
This paper investigates the potential of boosting weak learners to strong learners in learning from aggregate labels, revealing fundamental limitations in LLP and MIL, but also providing a polynomial-time method for LLP with large bags.
Contribution
It proves the impossibility of boosting in LLP and MIL with weak classifiers, and introduces a polynomial-time algorithm for strong learning in LLP with large bags.
Findings
Boosting is impossible in LLP and MIL with weak classifiers of accuracy less than 1.
A polynomial-time method can convert weak learners into strong learners in LLP with large bags.
Empirical validation on multiple datasets supports the proposed LLP algorithm.
Abstract
In learning from aggregate labels, the training data consists of sets or "bags" of feature-vectors (instances) along with an aggregate label for each bag derived from the (usually {0,1}-valued) labels of its instances. In learning from label proportions (LLP), the aggregate label is the average of the bag's instance labels, whereas in multiple instance learning (MIL) it is the OR. The goal is to train an instance-level predictor, typically by fitting a model on the training data, in particular one that maximizes the accuracy, which is the fraction of satisfied bags, i.e., those on which the predicted labels are consistent with the aggregate label. A weak learner has a constant accuracy < 1 on the training bags, while a strong learner's accuracy can be arbitrarily close to 1. We study the problem of using a weak learner on such training bags with aggregate labels to obtain a…
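The definitions in the abstract (LLP label as the average, MIL label as the OR, accuracy as the fraction of satisfied bags) can be illustrated with a minimal Python sketch. It is not the paper's algorithm; the function names, the toy one-dimensional bags, and the threshold predictors are all hypothetical and chosen only for illustration.

```python
from fractions import Fraction
from typing import Callable, List, Sequence


def llp_label(labels: Sequence[int]) -> Fraction:
    """LLP aggregate: the average of the bag's {0,1} instance labels (exact)."""
    return Fraction(sum(labels), len(labels))


def mil_label(labels: Sequence[int]) -> int:
    """MIL aggregate: the OR of the bag's {0,1} instance labels."""
    return int(any(labels))


def bag_accuracy(bags, bag_labels, predict: Callable[[float], int], aggregate) -> Fraction:
    """Fraction of satisfied bags: bags whose predicted instance labels
    aggregate to exactly the given bag label."""
    satisfied = sum(
        1
        for bag, target in zip(bags, bag_labels)
        if aggregate([predict(x) for x in bag]) == target
    )
    return Fraction(satisfied, len(bags))


# Toy data (hypothetical): three bags of 1-D features with hidden instance labels.
bags: List[List[float]] = [[0.2, 0.9], [0.7, 0.8], [0.1, 0.3]]
hidden: List[List[int]] = [[0, 1], [1, 1], [0, 0]]

llp_targets = [llp_label(y) for y in hidden]  # proportions 1/2, 1, 0
mil_targets = [mil_label(y) for y in hidden]  # ORs 1, 1, 0


def good(x: float) -> int:
    """Threshold predictor that matches the hidden labels on this toy data."""
    return int(x > 0.5)


def coarse(x: float) -> int:
    """Stricter threshold that mislabels the instance 0.7 in the second bag."""
    return int(x > 0.75)


print(bag_accuracy(bags, llp_targets, good, llp_label))    # 1
print(bag_accuracy(bags, llp_targets, coarse, llp_label))  # 2/3
print(bag_accuracy(bags, mil_targets, coarse, mil_label))  # 1
```

On this toy data the stricter predictor violates the proportion of the second bag but still satisfies its OR, so the same instance-level predictor can have different bag-level accuracy under LLP and MIL.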
Taxonomy
Topics: Text and Document Classification Technologies · Machine Learning and Data Classification
