Dictionary Learning with Few Samples and Matrix Concentration

Kyle Luh; Van Vu

arXiv:1503.08854·math.PR·April 2, 2015·IEEE Trans. Inf. Theory·1 cites

TL;DR

This paper improves the sample complexity bound for recovering a sparse matrix factorization $Y = AX$ from the $p \geq C n^2 \log^2 n$ of Spielman, Wang and Wright to a near-optimal bound, using new concentration inequalities for random matrices.

Contribution

It proves that $p \geq C n \log^4 n$ samples are enough for recovery, matching the conjectured optimal bound $p \geq C n \log n$ up to a polylogarithmic factor.

Findings

01

Achieved near-optimal sample complexity for dictionary learning.

02

Developed new concentration inequalities for random matrices.

03

Introduced a refined Bernstein inequality and union bound techniques.

Abstract

Let $A$ be an $n \times n$ matrix, $X$ be an $n \times p$ matrix and $Y = AX$. A challenging and important problem in data analysis, motivated by dictionary learning and other practical problems, is to recover both $A$ and $X$, given $Y$. Under normal circumstances, it is clear that this problem is underdetermined. However, in the case when $X$ is sparse and random, Spielman, Wang and Wright showed that one can recover both $A$ and $X$ efficiently from $Y$ with high probability, given that $p$ (the number of samples) is sufficiently large. Their method works for $p \geq C n^{2} \log^{2} n$ and they conjectured that $p \geq C n \log n$ suffices. The bound $n \log n$ is sharp for an obvious information theoretical reason. In this paper, we show that $p \geq C n \log^{4} n$ suffices, matching the conjectural bound up to a polylogarithmic factor. The core of our proof is a theorem concerning…
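To make the setup in the abstract concrete, the following is a minimal Python sketch, not the authors' method. It draws a random instance $Y = AX$ with a sparse $X$ and compares the three sample-size thresholds mentioned above. The Bernoulli-Gaussian model for $X$, the Gaussian choice of $A$, the sparsity parameter `theta`, the constant `C = 1`, and the function names are all assumptions made for illustration; the abstract only states that $X$ is sparse and random.

```python
import math
import numpy as np


def sample_bounds(n, C=1.0):
    """Sample-size thresholds from the abstract. The constant C is hypothetical."""
    log_n = math.log(n)
    return {
        "spielman_wang_wright": C * n**2 * log_n**2,  # p >= C n^2 log^2 n
        "luh_vu": C * n * log_n**4,                   # p >= C n log^4 n
        "conjectured": C * n * log_n,                 # p >= C n log n
    }


def generate_instance(n, p, theta, seed=0):
    """Draw A (n x n), sparse X (n x p), and Y = A X.

    Assumption: each entry of X is nonzero independently with
    probability theta, and nonzero entries are standard Gaussian.
    """
    rng = np.random.default_rng(seed)
    A = rng.standard_normal((n, n))        # invertible with probability 1
    support = rng.random((n, p)) < theta   # Bernoulli(theta) sparsity pattern
    X = support * rng.standard_normal((n, p))
    Y = A @ X
    return A, X, Y


if __name__ == "__main__":
    n = 1000
    for name, value in sample_bounds(n).items():
        print(f"{name}: p >= {value:.3g}")

    A, X, Y = generate_instance(n=20, p=200, theta=0.1)
    print("Y shape:", Y.shape, "fraction of nonzeros in X:", np.mean(X != 0))
```

The recovery problem takes only `Y` as input; `A` and `X` are returned here solely so that a candidate recovery procedure could be checked against the ground truth.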

Taxonomy

Topics: Sparse and Compressive Sensing Techniques · Machine Learning and Algorithms · Distributed Sensor Networks and Detection Algorithms