A Mutual Contamination Analysis of Mixed Membership and Partial Label Models
Julian Katz-Samuels, Clayton Scott

TL;DR
This paper studies mutual contamination models in machine learning, providing conditions for identifiability, algorithms for decontamination, and novel proof techniques applicable to mixed membership and partial label models.
Contribution
The paper gives necessary and sufficient conditions for identifiability and develops decontamination algorithms for both the infinite-sample and finite-sample settings, with proofs based on affine geometry.
Findings
Identifiability conditions for mutual contamination models established.
Algorithms for decontamination in mixed membership and partial label models developed.
Novel affine geometry-based proof techniques introduced.
Abstract
Many machine learning problems can be characterized by mutual contamination models. In these problems, one observes several random samples from different convex combinations of a set of unknown base distributions. It is of interest to decontaminate mutual contamination models, i.e., to recover the base distributions either exactly or up to a permutation. This paper considers the general setting where the base distributions are defined on arbitrary probability spaces. We examine the decontamination problem in two mutual contamination models that describe popular machine learning tasks: recovering the base distributions up to a permutation in a mixed membership model, and recovering the base distributions exactly in a partial label model for classification. We give necessary and sufficient conditions for identifiability of both mutual contamination models, algorithms for both problems in…
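The observation model described in the abstract can be illustrated with a small simulation. The following is only a sketch of the forward (contamination) process, not the paper's decontamination algorithm: the Gaussian base distributions, the mixing matrix values, and the names `sample_contaminated`, `base_samplers`, and `mixing_matrix` are all assumptions made for illustration.

```python
import random


def sample_contaminated(mixing_row, base_samplers, n, rng):
    """Draw n points from the convex combination sum_j mixing_row[j] * P_j."""
    if min(mixing_row) < 0 or abs(sum(mixing_row) - 1.0) > 1e-9:
        raise ValueError("mixing_row must be nonnegative and sum to 1")
    draws = []
    for _ in range(n):
        # Choose which base distribution generates this point,
        # with probability given by the mixing weights, then sample from it.
        j = rng.choices(range(len(base_samplers)), weights=mixing_row, k=1)[0]
        draws.append(base_samplers[j](rng))
    return draws


# Hypothetical base distributions P_1, P_2: unit-variance Gaussians.
base_samplers = [
    lambda rng: rng.gauss(-2.0, 1.0),
    lambda rng: rng.gauss(2.0, 1.0),
]

# Hypothetical mixing matrix: row i holds the weights of contaminated sample i.
mixing_matrix = [
    [0.8, 0.2],
    [0.3, 0.7],
]

rng = random.Random(0)
samples = [
    sample_contaminated(row, base_samplers, 1000, rng) for row in mixing_matrix
]
```

In the decontamination problem only `samples` would be observed; the base distributions, and in the mixed membership setting the mixing weights as well, are the unknowns to be recovered.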
Taxonomy
Topics: Machine Learning and Algorithms · Machine Learning and Data Classification · Bayesian Modeling and Causal Inference
