Simultaneous Dimension Reduction and Clustering via the NMF-EM Algorithm
Léna Carel, Pierre Alquier

TL;DR
This paper introduces the NMF-EM algorithm, combining nonnegative matrix factorization with EM to improve clustering and dimension reduction in non-Gaussian mixture models, demonstrated on public transport data.
Contribution
It proposes a novel constraint for non-Gaussian mixture models and integrates NMF into the EM algorithm for simultaneous parameter estimation and dimension reduction.
Findings
Effective clustering of public transport data
Interpretability of timetable slots as dictionary elements
Open-source R package implementation
Abstract
Mixture models are among the most popular tools for clustering. However, when the dimension and the number of clusters are large, the estimation of the clusters becomes challenging, as does their interpretation. Restrictions on the parameters can be used to reduce the dimension. An example is given by mixtures of factor analyzers (MFA) for Gaussian mixtures. The extension of MFA to non-Gaussian mixtures is not straightforward. We propose a new constraint for the parameters of non-Gaussian mixture models: the parameters of the K components are combinations of elements from a small dictionary, say H elements, with H ≪ K. Including a nonnegative matrix factorization (NMF) in the EM algorithm allows us to simultaneously estimate the dictionary and the parameters of the mixture. We propose the acronym NMF-EM for this algorithm, implemented in the R package nmfem. This original approach is…
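To make the idea in the abstract concrete, here is a minimal Python sketch of an EM loop whose M-step contains an NMF-style factorization. It is an illustration under stated assumptions, not the authors' algorithm and not the API of the nmfem R package: it assumes a mixture of multinomials on count data, writes the component parameters as Theta = Phi @ Lam (dictionary times weights), and swaps in PLSA-style multiplicative updates for the factorization step. The function name nmf_em_sketch and all of its parameters are hypothetical.

```python
import numpy as np

def nmf_em_sketch(X, K, H, n_iter=50, n_inner=5, seed=0):
    """Illustrative EM for a mixture of K multinomials whose parameters are
    constrained to Theta = Phi @ Lam with an H-element dictionary Phi.
    Hypothetical sketch, not the nmfem package implementation."""
    rng = np.random.default_rng(seed)
    n, M = X.shape
    eps = 1e-12
    # Dictionary (M x H) and weights (H x K); every column lies on a simplex,
    # so every column of Theta = Phi @ Lam is a valid multinomial parameter.
    Phi = rng.random((M, H)) + eps
    Phi /= Phi.sum(axis=0, keepdims=True)
    Lam = rng.random((H, K)) + eps
    Lam /= Lam.sum(axis=0, keepdims=True)
    pi = np.full(K, 1.0 / K)  # mixing proportions
    loglik = []
    for _ in range(n_iter):
        # E-step: responsibilities T[i, k], computed in log space for stability.
        Theta = Phi @ Lam
        log_p = X @ np.log(Theta + eps) + np.log(pi + eps)
        mx = log_p.max(axis=1, keepdims=True)
        lse = mx + np.log(np.exp(log_p - mx).sum(axis=1, keepdims=True))
        loglik.append(float(lse.sum()))  # log-likelihood up to a constant
        T = np.exp(log_p - lse)
        # M-step, part 1: closed-form update of the mixing proportions.
        pi = T.mean(axis=0)
        # M-step, part 2: factorize the expected counts C (M x K).
        # Assumption: a few PLSA-style multiplicative updates, which increase
        # sum(C * log(Phi @ Lam)) under the simplex constraints.
        C = X.T @ T
        for _ in range(n_inner):
            R = C / (Phi @ Lam + eps)
            Phi_new = Phi * (R @ Lam.T)
            Lam_new = Lam * (Phi.T @ R)
            Phi = Phi_new / (Phi_new.sum(axis=0, keepdims=True) + eps)
            Lam = Lam_new / (Lam_new.sum(axis=0, keepdims=True) + eps)
    return pi, Phi, Lam, T, loglik
```

With a count matrix X of shape (n, M), a call such as nmf_em_sketch(X, K=10, H=3) returns the mixing proportions, the dictionary, the per-cluster dictionary weights, the soft cluster assignments, and the log-likelihood trace. Choosing H much smaller than K is what yields the dimension reduction: each cluster is described by H weights rather than M free parameters.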
Taxonomy
Topics: Bayesian Methods and Mixture Models · Text and Document Classification Technologies · Advanced Clustering Algorithms Research
