Optimistic Rates for Learning from Label Proportions
Gene Li, Lin Chen, Adel Javanmard, Vahab Mirrokni

TL;DR
This paper analyzes learning from label proportions (LLP), a weakly supervised classification problem. It shows that EPRM achieves fast rates under realizability, while a debiased proportional square loss and EasyLLP achieve optimistic rates that are optimal up to log factors in both the realizable and agnostic settings.
Contribution
It establishes optimistic rate guarantees for specific LLP learning rules and shows that their sample complexity is optimal, up to log factors, in both the realizable and agnostic settings.
Findings
EPRM achieves fast rates under realizability, but EPRM and similar proportion matching rules can fail in the agnostic setting.
Debiased proportional square loss and EasyLLP achieve optimistic, optimal rates.
For these two rules, sample complexity is optimal up to log factors in both the realizable and agnostic settings.
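To make the idea of debiasing concrete, the sketch below checks one standard construction by exact enumeration. It assumes that labels within a bag are i.i.d. Bernoulli(p) with a known prior p, and it uses the soft label k(α − p) + p, which is in the spirit of EasyLLP; the function names and the choice of k and p are illustrative assumptions and are not taken from the paper.

```python
from fractions import Fraction
from itertools import product


def debiased_label(alpha, k, p):
    """Soft label built from a bag proportion alpha, bag size k, and prior p.

    Assumption for illustration: labels in a bag are i.i.d. Bernoulli(p).
    """
    return k * (alpha - p) + p


def conditional_mean(y1, k, p):
    """Exact E[debiased_label | first label = y1], enumerating the other k-1 labels."""
    total = Fraction(0)
    for rest in product((0, 1), repeat=k - 1):
        # Probability of this configuration of the remaining labels.
        prob = Fraction(1)
        for y in rest:
            prob *= p if y == 1 else 1 - p
        alpha = Fraction(y1 + sum(rest), k)  # bag proportion seen by the learner
        total += prob * debiased_label(alpha, k, p)
    return total


# Hypothetical example values: bag size 4, label prior 3/10.
p = Fraction(3, 10)
mean_given_one = conditional_mean(1, 4, p)
mean_given_zero = conditional_mean(0, 4, p)
```

Under these assumptions the conditional mean of the soft label equals the hidden individual label, which is why plugging it into an instance-level loss gives an unbiased risk estimate even though only bag proportions are observed.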
Abstract
We consider a weakly supervised learning problem called Learning from Label Proportions (LLP), where examples are grouped into "bags" and only the average label within each bag is revealed to the learner. We study various learning rules for LLP that achieve PAC learning guarantees for classification loss. We establish that the classical Empirical Proportional Risk Minimization (EPRM) learning rule (Yu et al., 2014) achieves fast rates under realizability, but EPRM and similar proportion matching learning rules can fail in the agnostic setting. We also show that (1) a debiased proportional square loss, as well as (2) a recently proposed EasyLLP learning rule (Busa-Fekete et al., 2023) both achieve "optimistic rates" (Panchenko, 2002); in both the realizable and agnostic settings, their sample complexity is optimal (up to log factors) in terms of $\epsilon$, $\delta$, and VC dimension.
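The LLP setup and a proportion matching rule can be sketched in a few lines. The example below is a minimal illustration, not the paper's implementation: it assumes one-dimensional inputs, a small finite class of threshold classifiers, and a squared difference between predicted and observed bag proportions as the proportional loss. All function names and data values are hypothetical.

```python
from typing import Callable, List, Sequence, Tuple

Bag = Tuple[List[float], float]  # (instances in the bag, average label)


def make_bags(xs: Sequence[float], ys: Sequence[int], k: int) -> List[Bag]:
    """Group consecutive examples into bags of size k and keep only the label average."""
    bags = []
    for i in range(0, len(xs) - k + 1, k):
        bags.append((list(xs[i:i + k]), sum(ys[i:i + k]) / k))
    return bags


def threshold_classifier(t: float) -> Callable[[float], int]:
    """Hypothetical hypothesis class: predict 1 when x >= t."""
    return lambda x: 1 if x >= t else 0


def proportional_risk(h: Callable[[float], int], bags: List[Bag]) -> float:
    """Mean squared gap between predicted and observed label proportions."""
    total = 0.0
    for instances, alpha in bags:
        predicted = sum(h(x) for x in instances) / len(instances)
        total += (predicted - alpha) ** 2
    return total / len(bags)


def eprm_threshold(thresholds: Sequence[float], bags: List[Bag]) -> float:
    """Return the threshold whose classifier best matches the bag proportions."""
    return min(thresholds, key=lambda t: proportional_risk(threshold_classifier(t), bags))


# Toy realizable data: the hidden labels follow the threshold 0.5.
xs = [0.1, 0.2, 0.6, 0.9, 0.3, 0.7, 0.8, 0.4]
ys = [0, 0, 1, 1, 0, 1, 1, 0]
bags = make_bags(xs, ys, k=4)
best = eprm_threshold([0.0, 0.25, 0.5, 0.75, 1.0], bags)
```

The learner never sees `ys` directly, only the per-bag averages stored in `bags`. On this realizable toy data the true threshold matches every bag proportion exactly, so it attains zero proportional risk.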
Taxonomy
Topics: Machine Learning and Algorithms · Machine Learning and Data Classification
