TL;DR
This paper introduces diverse dictionary learning and shows that certain set-theoretic structures of the latent variables, such as intersections, complements, and symmetric differences, remain identifiable under minimal assumptions, even in general settings where full identifiability is unattainable.
Contribution
It formalizes the concept of diverse dictionary learning, providing set-theoretic identifiability results and a simple inductive bias applicable to various models.
Findings
Set operations on latent variables are identifiable with minimal assumptions.
Structural diversity enables full identifiability of all latent variables.
The proposed bias improves latent variable recovery on synthetic and real data.
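To make the first finding concrete, the following is a minimal sketch of the three set operations applied to latent index sets. It is not the paper's algorithm. The dependency structure `support`, the variable names `x1` and `x2`, and the latent indices are all hypothetical, and "complement" is read here as a relative complement, which is an assumption.

```python
# Hypothetical dependency structure: which latent indices influence each
# observed variable. These names and values are illustrative only and are
# not taken from the paper.
support = {
    "x1": {0, 1, 2},
    "x2": {1, 2, 3},
}

# Latents that influence both observations (intersection).
shared = support["x1"] & support["x2"]

# Latents that influence x1 but not x2 (relative complement; assumed reading).
only_x1 = support["x1"] - support["x2"]

# Latents that influence exactly one of the two observations
# (symmetric difference).
exactly_one = support["x1"] ^ support["x2"]

print(shared, only_x1, exactly_one)
```

In this toy setting, the finding says that groups of latents like `shared`, `only_x1`, and `exactly_one` are the kind of structure that can be recovered with guarantees, even when the individual latent variables cannot.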
Abstract
Given only observational data, where both the latent variables and the generating process are unknown, recovering the latent variables is ill-posed without additional assumptions. Existing methods often assume linearity or rely on auxiliary supervision and functional constraints. However, such assumptions are rarely verifiable in practice, and most theoretical guarantees break down under even mild violations, leaving uncertainty about how to reliably understand the hidden world. To make identifiability actionable in real-world scenarios, we take a complementary view: in general settings where full identifiability is unattainable, what can still be recovered with guarantees, and what biases could be universally adopted? We introduce the problem of diverse dictionary learning to formalize this view. Specifically, we show that intersections, complements, and symmetric differences…