A sampling algorithm to compute the set of feasible solutions for non-negative matrix factorization with an arbitrary rank
Ragnhild Laursen, Asger Hobolth

TL;DR
This paper introduces a new sampling algorithm to compute the Set of Feasible Solutions in non-negative matrix factorization, applicable to arbitrary ranks, addressing non-uniqueness issues especially relevant in cancer genomics.
Contribution
The paper presents an easy-to-implement sampling algorithm for NMF that works for any rank, unlike existing methods limited to rank ≤ 4, and demonstrates its effectiveness on mutational data.
Findings
The algorithm performs comparably to polygon inflation for low ranks.
The size of the SFS influences the variability of solutions.
The method is applicable to real-world cancer genomics data.
Abstract
Non-negative Matrix Factorization (NMF) is a useful method to extract features from multivariate data, but an important and sometimes neglected concern is that NMF can result in non-unique solutions. Often, there exists a Set of Feasible Solutions (SFS), which makes it more difficult to interpret the factorization. This problem is especially ignored in cancer genomics, where NMF is used to infer information about the mutational processes present in the evolution of cancer. In this paper the extent of non-uniqueness is investigated for two mutational count data sets, and a new sampling algorithm that can find the SFS is introduced. Our sampling algorithm is easy to implement and applies to an arbitrary rank of NMF. This is in contrast to the state of the art, where the NMF rank must be smaller than or equal to four. For lower ranks we show that our algorithm performs similarly to the polygon…
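The non-uniqueness described in the abstract can be illustrated with a small toy example. The sketch below is not the paper's sampling algorithm; it only shows, under assumed notation (a data matrix V factorized as W times H, with hypothetical rank-2 toy values), that an invertible matrix A can turn one non-negative factorization into a different one with the same product.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def is_nonnegative(m):
    """Return True if every entry of the matrix is >= 0."""
    return all(v >= 0 for row in m for v in row)


# Hypothetical toy factors of rank 2 (values chosen for illustration only).
W = [[1.0, 2.0], [3.0, 1.0], [2.0, 2.0]]
H = [[2.0, 1.0, 1.0], [1.0, 1.0, 2.0]]
V = matmul(W, H)

# An invertible shear matrix A and its inverse (assumed example, eps = 0.5).
eps = 0.5
A = [[1.0, eps], [0.0, 1.0]]
A_inv = [[1.0, -eps], [0.0, 1.0]]

# Alternative factors: (W A)(A^-1 H) = W H, so the product is unchanged.
W2 = matmul(W, A)
H2 = matmul(A_inv, H)
V2 = matmul(W2, H2)

same_product = all(
    abs(x - y) < 1e-12 for r1, r2 in zip(V, V2) for x, y in zip(r1, r2)
)
both_nonnegative = is_nonnegative(W2) and is_nonnegative(H2)

print("same product:", same_product)
print("alternative factors non-negative:", both_nonnegative)
print("factors differ:", W2 != W)
```

In this toy case the pairs (W, H) and (W2, H2) are both valid non-negative factorizations of the same matrix, so each belongs to its set of feasible solutions. Larger values of eps would eventually make an entry of H2 negative, which is what bounds that set.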
Taxonomy
Topics: Gene expression and cancer classification · Genetic Mapping and Diversity in Plants and Animals · Bioinformatics and Genomic Networks
