Analysis of Diagnostics (Part II): Prevalence, Linear Independence, and Unsupervised Learning
Paul N. Patrone, Raquel A. Binder, Catherine S. Forconi, Ann M. Moormann, Anthony J. Kearsley

TL;DR
This paper extends diagnostic testing methods to unsupervised learning, demonstrating how prevalence and linear algebra concepts can identify class distributions without labeled data, with applications to synthetic and real SARS-CoV-2 data.
Contribution
It introduces the concept of linearly independent populations and shows how prevalence can be estimated in unsupervised settings using linear algebra techniques.
Findings
Prevalence estimation is possible in unsupervised learning with linear algebra.
Linearly independent populations have different but unknown prevalence values.
The method is demonstrated on synthetic data and on SARS-CoV-2 ELISA data.
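To make the idea of linearly independent populations concrete, here is a minimal sketch. It assumes, for illustration only, that each population's measurement distribution is a two-class mixture `q * pos + (1 - q) * neg`, where `q` is the prevalence. The names `population_density`, `pos`, and `neg`, the bin values, and the prevalences are all hypothetical and are not taken from the paper, whose construction may differ.

```python
import numpy as np

def population_density(q, pos, neg):
    """Mixture density of a population with prevalence q (illustrative model)."""
    return q * pos + (1.0 - q) * neg

# Hypothetical class-conditional densities, discretized on 5 bins.
pos = np.array([0.05, 0.10, 0.20, 0.30, 0.35])
neg = np.array([0.40, 0.30, 0.15, 0.10, 0.05])

# Two populations with different prevalence values give independent rows.
q1, q2 = 0.2, 0.7
Q_diff = np.vstack([population_density(q1, pos, neg),
                    population_density(q2, pos, neg)])

# Two populations with the same prevalence give identical rows.
Q_same = np.vstack([population_density(0.4, pos, neg),
                    population_density(0.4, pos, neg)])

rank_diff = np.linalg.matrix_rank(Q_diff)  # 2: linearly independent
rank_same = np.linalg.matrix_rank(Q_same)  # 1: linearly dependent

# If the prevalences were known (an assumption made only for this demo;
# in the unsupervised setting they are unknown), the class-conditional
# densities would follow from a 2x2 linear solve.
A = np.array([[q1, 1.0 - q1],
              [q2, 1.0 - q2]])
recovered = np.linalg.solve(A, Q_diff)  # row 0 ~ pos, row 1 ~ neg
```

The rank check shows why differing prevalence matters in this toy model: only then do the two observed mixtures span the same two-dimensional space as the underlying class distributions.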
Abstract
This is the second manuscript in a two-part series that uses diagnostic testing to understand the connection between prevalence (i.e. number of elements in a class), uncertainty quantification (UQ), and classification theory. Part I considered the context of supervised machine learning (ML) and established a duality between prevalence and the concept of relative conditional probability. The key idea of that analysis was to train a family of discriminative classifiers by minimizing a sum of prevalence-weighted empirical risk functions. The resulting outputs can be interpreted as relative probability level-sets, which thereby yield uncertainty estimates in the class labels. This procedure also demonstrated that certain discriminative and generative ML models are equivalent. Part II considers the extent to which these results can be extended to tasks in unsupervised learning through…
Taxonomy
Topics: Neural Networks and Applications
