Generalized Identifiability Bounds for Mixture Models with Grouped Samples

Robert A. Vandermeulen; René Saitenmacher

arXiv:2207.11164·math.ST·July 25, 2022

Open Access

TL;DR

This paper extends identifiability bounds for mixture models with grouped samples: if every subset of $k$ mixture components is linearly independent, fewer samples per group suffice for identifiability, with implications for multinomial mixture models and topic modeling.

Contribution

The paper generalizes existing identifiability bounds by relating the number of samples required per group to the linear independence of subsets of mixture components.

Findings

01 Identifiability is achieved with fewer samples per group when every subset of $k$ mixture components is linearly independent.

02 Lower bounds on the number of samples per group are given for both identifiability and determinedness.

03 Components chosen randomly from a $k$-dimensional space almost surely satisfy the independence condition.
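The independence condition in the findings can be checked numerically on a small example. The following is a minimal sketch, not the authors' procedure: it draws $m$ random probability vectors over $k$ outcomes (normalized exponential draws, an assumption made here to stand in for "chosen randomly from a $k$-dimensional space") and tests whether every subset of $k$ of them has full rank. The helper names `rank`, `every_k_subset_independent`, and `random_probability_vector`, and the tolerance `tol`, are hypothetical choices for illustration.

```python
import itertools
import random


def rank(rows, tol=1e-9):
    """Numerical rank via Gaussian elimination with partial pivoting."""
    a = [list(r) for r in rows]
    n_rows, n_cols = len(a), len(a[0])
    r = 0  # number of pivots found so far
    for c in range(n_cols):
        if r == n_rows:
            break
        # Pick the row with the largest entry in column c as the pivot.
        pivot = max(range(r, n_rows), key=lambda i: abs(a[i][c]))
        if abs(a[pivot][c]) <= tol:
            continue  # column is numerically zero below row r
        a[r], a[pivot] = a[pivot], a[r]
        for i in range(r + 1, n_rows):
            f = a[i][c] / a[r][c]
            for j in range(c, n_cols):
                a[i][j] -= f * a[r][j]
        r += 1
    return r


def every_k_subset_independent(components, k, tol=1e-9):
    """True if every subset of k components is linearly independent."""
    return all(
        rank(subset, tol) == k
        for subset in itertools.combinations(components, k)
    )


def random_probability_vector(dim, rng):
    """Normalized exponential draws: a random point on the simplex."""
    weights = [rng.expovariate(1.0) for _ in range(dim)]
    total = sum(weights)
    return [w / total for w in weights]


rng = random.Random(0)
m, k = 6, 3  # illustrative sizes, not taken from the paper
components = [random_probability_vector(k, rng) for _ in range(m)]
print(every_k_subset_independent(components, k))
```

The check enumerates all $\binom{m}{k}$ subsets, so it is only practical for small $m$ and $k$. Appending a duplicate of an existing component makes the check fail for $k = 2$, since two identical vectors are linearly dependent.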

Abstract

Recent work has shown that finite mixture models with $m$ components are identifiable, while making no assumptions on the mixture components, so long as one has access to groups of samples of size $2m - 1$ which are known to come from the same mixture component. In this work we generalize that result and show that, if every subset of $k$ mixture components of a mixture model is linearly independent, then that mixture model is identifiable with only $(2m - 1)/(k - 1)$ samples per group. We further show that this value cannot be improved. We prove an analogous result for a stronger form of identifiability known as "determinedness" along with a corresponding lower bound. This independence assumption almost surely holds if mixture components are chosen randomly from a $k$-dimensional space. We describe some implications of our results for multinomial mixture models and topic modeling.
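As a rough illustration of the bound stated in the abstract, the sketch below computes the smallest whole number of samples per group that is at least $(2m - 1)/(k - 1)$. Rounding up to an integer and the input check $2 \le k \le m$ are assumptions made here, not statements taken from the paper, and the function name `samples_per_group` is hypothetical.

```python
def samples_per_group(m: int, k: int) -> int:
    """Smallest integer n with n >= (2m - 1) / (k - 1).

    m: number of mixture components.
    k: size of the component subsets assumed linearly independent.
    The range check 2 <= k <= m is an assumption of this sketch.
    """
    if not 2 <= k <= m:
        raise ValueError("expected 2 <= k <= m")
    numerator = 2 * m - 1
    denominator = k - 1
    # Integer ceiling division, avoiding floating-point rounding.
    return -(-numerator // denominator)


for k in (2, 3, 5):
    print(k, samples_per_group(5, k))
```

For $k = 2$ the expression reduces to $2m - 1$, the group size from the earlier result quoted in the abstract; larger $k$ lowers the requirement.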

Taxonomy

Topics: Bayesian Methods and Mixture Models · Machine Learning and Algorithms · Bayesian Modeling and Causal Inference