Dictionary Learning with Few Samples and Matrix Concentration
Kyle Luh, Van Vu

TL;DR
This paper improves the sample complexity bound for recovering a sparse matrix factorization, showing that a near-optimal number of samples suffices. The proof relies on a refined version of Bernstein's inequality and a careful application of the union bound.
Contribution
It proves that $p \,\geq\, C n \log^4 n$ samples are enough for recovery, approaching the conjectured optimal bound up to a polylogarithmic factor.
Findings
Achieved near-optimal sample complexity for dictionary learning.
Developed new concentration inequalities for random matrices.
Introduced a refined version of Bernstein's inequality and improved union bound techniques.
Abstract
Let $A$ be an $n \times n$ matrix, $X$ be an $n \times p$ matrix and $Y = AX$. A challenging and important problem in data analysis, motivated by dictionary learning and other practical problems, is to recover both $A$ and $X$, given $Y$. Under normal circumstances, it is clear that this problem is underdetermined. However, in the case when $X$ is sparse and random, Spielman, Wang and Wright showed that one can recover both $A$ and $X$ efficiently from $Y$ with high probability, given that $p$ (the number of samples) is sufficiently large. Their method works for $p \geq C n^2 \log^2 n$ and they conjectured that $p \geq C n \log n$ suffices. The bound $n \log n$ is sharp for an obvious information theoretical reason. In this paper, we show that $p \geq C n \log^4 n$ suffices, matching the conjectural bound up to a polylogarithmic factor. The core of our proof is a theorem concerning…
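To make the setup concrete, here is a minimal sketch of the model $Y = AX$ and of the three sample-size scalings mentioned in the abstract. Several things here are assumptions for illustration only and are not taken from the text above: the constant $C = 1$, the Bernoulli-Gaussian choice for the sparse random $X$ (each entry is nonzero with probability `theta`, and standard normal when nonzero), and the helper names `sample_bounds`, `sparse_random_matrix` and `matmul`. The sketch only generates data and compares scalings; it does not implement any recovery algorithm.

```python
import math
import random


def sample_bounds(n, C=1.0):
    """Sample-size scalings for dimension n, with an assumed constant C."""
    log_n = math.log(n)
    return {
        "spielman_wang_wright": C * n**2 * log_n**2,  # p >= C n^2 log^2 n
        "this_paper": C * n * log_n**4,               # p >= C n log^4 n
        "conjectured": C * n * log_n,                 # p >= C n log n
    }


def sparse_random_matrix(n, p, theta, rng):
    """n x p matrix with Bernoulli-Gaussian entries (assumed sparsity model)."""
    return [
        [rng.gauss(0.0, 1.0) if rng.random() < theta else 0.0 for _ in range(p)]
        for _ in range(n)
    ]


def matmul(A, X):
    """Plain-Python product of an n x n matrix A with an n x p matrix X."""
    columns = list(zip(*X))
    return [[sum(a * x for a, x in zip(row, col)) for col in columns] for row in A]


if __name__ == "__main__":
    rng = random.Random(0)
    n, p, theta = 4, 12, 0.3
    # A is an arbitrary dense matrix here; the recovery problem is to get A and X back from Y.
    A = [[rng.gauss(0.0, 1.0) for _ in range(n)] for _ in range(n)]
    X = sparse_random_matrix(n, p, theta, rng)
    Y = matmul(A, X)
    print("Y has shape", len(Y), "x", len(Y[0]))
    for name, value in sample_bounds(1000).items():
        print(f"{name}: about {value:.3g} samples at n = 1000")
```

For any fixed $C$ and $n > e$, the three scalings are ordered as conjectured bound, then this paper's bound, then the earlier bound, which is the sense in which the new result closes most of the gap.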
Taxonomy
Topics: Sparse and Compressive Sensing Techniques · Machine Learning and Algorithms · Distributed Sensor Networks and Detection Algorithms
