A variational autoencoder-based nonnegative matrix factorisation model for deep dictionary learning

Hong-Bo Xie; Caoyuan Li; Shuliang Wang; Richard Yi Da Xu; Kerrie Mengersen

arXiv:2301.07272·cs.LG·January 19, 2023

TL;DR

This paper introduces VAE-NMF, a probabilistic generative model that combines a variational autoencoder with nonnegative matrix factorization to learn nonnegative dictionaries for signal processing tasks, and reports that it outperforms existing methods.

Contribution

It proposes a new VAE-based NMF model with a Gamma-distributed latent space and a specialized loss function for nonnegativity, advancing deep dictionary learning techniques.

Findings

01. VAE-NMF outperforms state-of-the-art methods in dictionary learning.

02. The model effectively enhances speech signals.

03. The approach successfully extracts muscle synergies.

Abstract

Construction of dictionaries using nonnegative matrix factorisation (NMF) has extensive applications in signal processing and machine learning. With the advances in deep learning, training compact and robust dictionaries using deep neural networks, i.e., dictionaries of deep features, has been proposed. In this study, we propose a probabilistic generative model which employs a variational autoencoder (VAE) to perform nonnegative dictionary learning. In contrast to existing VAE models, we cast the model under a statistical framework with latent variables obeying a Gamma distribution and design a new loss function that guarantees nonnegative dictionaries. We adopt an acceptance-rejection sampling reparameterization trick to update the latent variables iteratively. We apply the dictionaries learned from VAE-NMF to two signal processing tasks, i.e., enhancement of speech and extraction…
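
To make the acceptance-rejection reparameterization idea concrete, here is a minimal NumPy sketch of drawing a Gamma-distributed latent by rejection sampling. It uses the Marsaglia-Tsang sampler, a common choice for Gamma variates; the abstract does not say which proposal the paper uses, so treat this as an assumption. The names `gamma_transform` and `sample_gamma` are hypothetical and not taken from the paper. The point to notice is that an accepted sample is a smooth function of the shape parameter `alpha` and a standard normal draw `eps`, which is what makes a reparameterized gradient possible.

```python
import numpy as np


def gamma_transform(eps, alpha):
    """Marsaglia-Tsang map from a normal draw eps to a Gamma(alpha, 1) candidate.

    For a fixed eps this is a smooth function of alpha (alpha >= 1),
    which is the property a reparameterization trick relies on.
    """
    d = alpha - 1.0 / 3.0
    c = 1.0 / np.sqrt(9.0 * d)
    return d * (1.0 + c * eps) ** 3


def sample_gamma(alpha, rate=1.0, rng=None):
    """Draw one Gamma(alpha, rate) sample by acceptance-rejection."""
    rng = np.random.default_rng() if rng is None else rng
    if alpha < 1.0:
        # Shape boosting: sample with shape alpha + 1, then scale by U^(1/alpha).
        u = rng.uniform()
        return sample_gamma(alpha + 1.0, rate, rng) * u ** (1.0 / alpha)
    d = alpha - 1.0 / 3.0
    c = 1.0 / np.sqrt(9.0 * d)
    while True:
        eps = rng.standard_normal()
        v = (1.0 + c * eps) ** 3
        if v <= 0.0:
            continue  # candidate outside the support, reject
        u = rng.uniform()
        # Accept with the Marsaglia-Tsang log-ratio test.
        if np.log(u) < 0.5 * eps**2 + d - d * v + d * np.log(v):
            return gamma_transform(eps, alpha) / rate
```

Every accepted sample is strictly positive, so latents drawn this way stay nonnegative by construction. In a VAE, the encoder would output `alpha` and `rate`, and gradients would flow through `gamma_transform` using an automatic differentiation library instead of plain NumPy.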

Taxonomy

Topics: Speech and Audio Processing · Speech Recognition and Synthesis · Phonetics and Phonology Research