Identifiability of Complete Dictionary Learning

Jérémy E. Cohen; Nicolas Gillis

arXiv:1808.08765·stat.ML·September 20, 2019

TL;DR

This paper establishes new deterministic bounds on the number of samples needed for essentially unique recovery in complete dictionary learning, especially when the data has low-rank structure, improving on previous combinatorial bounds.

Contribution

The paper provides the first non-asymptotic sample bounds for identifiability in the deterministic low-rank dictionary learning setting, reducing the sample complexity from combinatorial to polynomial in $r$.

Findings

01

Sample complexity bound of $O(r^3/(r-k)^2)$ for identifiability.

02

Necessary lower bounds that challenge previous assumptions.

03

When a constant proportion of the entries of $B$ are zero, only $O(r)$ samples are required for recovery.
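
One way to connect the first and third findings, assuming the third is meant as a special case of the first: suppose every column of $B$ has at most $k \le (1-c)\,r$ non-zero entries for some constant $c \in (0,1]$ independent of $r$ (the symbol $c$ is introduced here for illustration only). Then $r - k \ge c\,r$, and

$$\frac{r^3}{(r-k)^2} \;\le\; \frac{r^3}{c^2 r^2} \;=\; \frac{r}{c^2} \;=\; O(r).$$

For instance, $c = 0.1$ gives $r^3/(r-k)^2 \le 100\,r$. The constant hidden in the $O(\cdot)$ of the bound itself is not stated in this summary.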

Abstract

Sparse component analysis (SCA), also known as complete dictionary learning, is the following problem: Given an input matrix $M$ and an integer $r$, find a dictionary $D$ with $r$ columns and a matrix $B$ with $k$-sparse columns (that is, each column of $B$ has at most $k$ non-zero entries) such that $M \approx D B$. A key issue in SCA is identifiability, that is, characterizing the conditions under which $D$ and $B$ are essentially unique (that is, they are unique up to permutation and scaling of the columns of $D$ and rows of $B$). Although SCA has been vastly investigated in the last two decades, only a few works have tackled this issue in the deterministic scenario, and no work provides reasonable bounds in the minimum number of samples (that is, columns of $M$) that leads to identifiability. In this work, we provide new results in the deterministic scenario when the data has a…
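
The notion of essential uniqueness in the abstract can be illustrated with a small numerical sketch. The code below is not taken from the paper: the function names `make_sca_instance` and `permute_and_scale`, the Gaussian sampling, and the sizes are all assumptions chosen for illustration. It builds a noiseless instance $M = DB$ with $k$-sparse columns in $B$, then shows that permuting and rescaling the columns of $D$ (with the matching change to the rows of $B$) yields a second factorization with the same product and the same sparsity, which is why identifiability can only hold up to these two ambiguities.

```python
import numpy as np

def make_sca_instance(r, k, n, seed=0):
    """Hypothetical helper: random r-by-r dictionary D and r-by-n matrix B
    whose columns each have exactly k non-zero entries."""
    rng = np.random.default_rng(seed)
    D = rng.standard_normal((r, r))
    B = np.zeros((r, n))
    for j in range(n):
        # Pick k distinct row indices as the support of column j.
        support = rng.choice(r, size=k, replace=False)
        B[support, j] = rng.standard_normal(k)
    return D, B

def permute_and_scale(D, B, seed=1):
    """Hypothetical helper: apply a random permutation and positive scaling
    to the columns of D, and the compensating change to the rows of B."""
    rng = np.random.default_rng(seed)
    r = D.shape[1]
    perm = rng.permutation(r)
    scales = rng.uniform(0.5, 2.0, size=r)
    D2 = D[:, perm] * scales           # column i of D2 is scales[i] * D[:, perm[i]]
    B2 = B[perm, :] / scales[:, None]  # row i of B2 is B[perm[i], :] / scales[i]
    return D2, B2

r, k, n = 6, 3, 40                     # illustrative sizes only
D, B = make_sca_instance(r, k, n)
M = D @ B                              # noiseless data matrix

D2, B2 = permute_and_scale(D, B)
same_product = np.allclose(M, D2 @ B2)
max_nonzeros = int(np.count_nonzero(B2, axis=0).max())

print("same product:", same_product)
print("columns of B2 still k-sparse:", max_nonzeros <= k)
```

The sketch only demonstrates the ambiguity that is always present. It does not test whether a given $M$ admits other, genuinely different factorizations, which is the question the sample-complexity bounds address.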
