A Sparsity-promoting Dictionary Model for Variational Autoencoders
Mostafa Sadeghi, Paul Magron

TL;DR
This paper introduces a sparsity-promoting dictionary model for VAEs that enhances interpretability and expressiveness without sacrificing reconstruction quality, using a simple, efficient, and tuning-free approach.
Contribution
The paper proposes a sparsity-promoting dictionary model for VAEs that uses a zero-mean Gaussian latent prior with learnable variances, improving interpretability and sparsity without degrading output quality.
Findings
Outperforms competing methods in speech generation tasks
Promotes sparsity without reducing speech quality
Uses a computationally efficient, tuning-free inference scheme
Abstract
Structuring the latent space in probabilistic deep generative models, e.g., variational autoencoders (VAEs), is important to yield more expressive models and interpretable representations, and to avoid overfitting. One way to achieve this objective is to impose a sparsity constraint on the latent variables, e.g., via a Laplace prior. However, such approaches usually complicate the training phase, and they sacrifice the reconstruction quality to promote sparsity. In this paper, we propose a simple yet effective methodology to structure the latent space via a sparsity-promoting dictionary model, which assumes that each latent code can be written as a sparse linear combination of a dictionary's columns. In particular, we leverage a computationally efficient and tuning-free method, which relies on a zero-mean Gaussian latent prior with learnable variances. We derive a variational inference…
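The dictionary model described in the abstract can be illustrated with a small NumPy sketch. It is not the authors' implementation: the names `D`, `alpha`, `gamma`, `mu`, and `s2`, the dimensions, and the closed-form variance update are assumptions made for illustration. The sketch writes a latent code as z = D @ alpha, places a zero-mean Gaussian prior with per-coefficient variances `gamma` on `alpha`, and shows that a coefficient whose variance is zero is switched off, which is one way such a prior can promote sparsity.

```python
import numpy as np


def sample_latent(D, gamma, rng):
    """Draw z = D @ alpha with alpha ~ N(0, diag(gamma)).

    D     : (latent_dim, n_atoms) dictionary, one atom per column (assumed shape)
    gamma : (n_atoms,) prior variances; a zero entry forces that coefficient to 0
    """
    alpha = rng.normal(size=gamma.shape) * np.sqrt(gamma)
    return D @ alpha, alpha


def kl_to_zero_mean_prior(mu, s2, gamma):
    """KL( N(mu, diag(s2)) || N(0, diag(gamma)) ), summed over coefficients."""
    return 0.5 * np.sum(np.log(gamma / s2) + (s2 + mu**2) / gamma - 1.0)


def update_variances(mu, s2):
    """Value of gamma that minimises the KL term above for fixed mu and s2.

    Setting the derivative with respect to gamma to zero gives
    gamma = mu**2 + s2. Whether the paper uses exactly this update
    is an assumption of this sketch.
    """
    return mu**2 + s2


rng = np.random.default_rng(0)
latent_dim, n_atoms = 4, 8
D = rng.normal(size=(latent_dim, n_atoms))

# Only two atoms have non-zero prior variance, so alpha is sparse.
gamma = np.array([1.0, 0.0, 0.0, 0.5, 0.0, 0.0, 0.0, 0.0])
z, alpha = sample_latent(D, gamma, rng)
print("non-zero coefficients:", np.count_nonzero(alpha))

# Hypothetical posterior statistics for the coefficients of one data point.
mu = np.array([1.5, 0.0, -0.01, 0.8, 0.0, 0.02, 0.0, 0.0])
s2 = np.full(n_atoms, 1e-3)
gamma_star = update_variances(mu, s2)
print("KL with unit variances:   ", kl_to_zero_mean_prior(mu, s2, np.ones(n_atoms)))
print("KL with updated variances:", kl_to_zero_mean_prior(mu, s2, gamma_star))
```

In this sketch the updated variances shrink toward zero wherever the posterior mean is close to zero, so no sparsity weight has to be hand-tuned; this is meant only to convey the intuition behind a "tuning-free" prior, under the assumptions stated above.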
Taxonomy
Topics: Speech Recognition and Synthesis · Music and Audio Processing · Topic Modeling
Methods: Variational Inference
