Label Propagation for Learning with Label Proportions
Rafael Poyiadzi, Raul Santos-Rodriguez, Niall Twomey

TL;DR
This paper introduces a graph-based algorithm for learning with label proportions, which recovers instance-level labels from data that is labeled only in aggregate at the bag level. The setting is especially relevant in healthcare, where individual labeling is costly.
Contribution
It proposes a novel, efficient label propagation method that maintains bag proportions and leverages data structure for improved label recovery.
Findings
The algorithm outperforms existing methods in label accuracy.
It is computationally efficient and scalable.
Demonstrated effectiveness on healthcare datasets.
Abstract
Learning with Label Proportions (LLP) is the problem of recovering the underlying true labels given a dataset when the data is presented in the form of bags. This paradigm is particularly suitable in contexts where providing individual labels is expensive and label aggregates are more easily obtained. In the healthcare domain, it is a burden for a patient to keep a detailed diary of their daily routines, but often they will be amenable to providing higher-level summaries of daily behavior. We present a novel and efficient graph-based algorithm that encourages local smoothness and exploits the global structure of the data, while preserving the 'mass' of each bag.
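To make the idea in the abstract concrete, the following is a minimal sketch of graph-based label propagation with a per-bag mass constraint. It is not the authors' algorithm: it swaps in the standard normalized-affinity propagation update and adds a simple rescaling step so that each bag's class totals match its stated proportions. The function names (`rbf_affinity`, `propagate_with_bag_mass`), the RBF kernel, and the parameters `alpha`, `gamma`, and `n_iter` are all assumptions made for illustration.

```python
import numpy as np


def rbf_affinity(X, gamma=1.0):
    """Dense RBF affinity matrix with a zero diagonal (illustrative choice)."""
    sq_dists = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    W = np.exp(-gamma * sq_dists)
    np.fill_diagonal(W, 0.0)
    return W


def propagate_with_bag_mass(X, bags, proportions, alpha=0.9, n_iter=50, gamma=1.0):
    """Hypothetical sketch: propagate soft labels, then restore each bag's mass.

    X           : (n, d) feature matrix
    bags        : (n,) integer bag index per instance, in 0..B-1
    proportions : (B, C) float array, class proportions per bag (rows sum to 1)
    Returns an (n, C) matrix of soft labels.
    """
    W = rbf_affinity(X, gamma)
    degree = W.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(np.maximum(degree, 1e-12))
    S = W * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]  # symmetric normalization

    Y = proportions[bags]  # every instance starts at its bag's proportions
    F = Y.copy()
    bag_ids = np.unique(bags)

    for _ in range(n_iter):
        # Smoothness step: mix neighbours' soft labels with the initial guess.
        F = alpha * (S @ F) + (1.0 - alpha) * Y
        # Mass step: rescale each class column inside a bag so that the
        # bag's class totals equal proportion * bag size.
        for b in bag_ids:
            idx = np.where(bags == b)[0]
            target = proportions[b] * idx.size
            current = F[idx].sum(axis=0)
            scale = np.divide(target, current,
                              out=np.zeros_like(target), where=current > 0)
            F[idx] *= scale
    return F
```

Taking `F.argmax(axis=1)` gives a hard label per instance. The rescaling step only constrains class totals within a bag, so individual rows of `F` need not sum to one; a fuller treatment would alternate row and column normalization or solve a constrained problem.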
Taxonomy
Topics: Machine Learning and Data Classification · Machine Learning and Algorithms · Text and Document Classification Technologies
