Variational Inference in Non-negative Factorial Hidden Markov Models for Efficient Audio Source Separation

Gautham Mysore (Adobe Systems), Maneesh Sahani (University College, London)

arXiv:1206.6468 · cs.LG · July 3, 2012 · ICML · 19 cites

Open Access

TL;DR

This paper introduces a Bayesian variant of the non-negative factorial hidden Markov model (N-FHMM) together with a variational inference algorithm whose complexity is linear in the number of sound sources. The method is around 30x faster than exact inference for audio source separation while achieving comparable separation quality.

Contribution

A variational inference algorithm for a Bayesian variant of the N-FHMM that reduces the complexity of inference from exponential to linear in the number of sound sources, enabling faster audio source separation.

Findings

01 Achieves around a 30x speedup over exact inference.

02 Performs comparably to the original N-FHMM in separation quality.

03 Complexity of inference is linear in the number of sound sources.

Abstract

The past decade has seen substantial work on the use of non-negative matrix factorization and its probabilistic counterparts for audio source separation. Although able to capture audio spectral structure well, these models neglect the non-stationarity and temporal dynamics that are important properties of audio. The recently proposed non-negative factorial hidden Markov model (N-FHMM) introduces a temporal dimension and improves source separation performance. However, the factorial nature of this model makes the complexity of inference exponential in the number of sound sources. Here, we present a Bayesian variant of the N-FHMM suited to an efficient variational inference algorithm, whose complexity is linear in the number of sound sources. Our algorithm performs comparably to exact inference in the original N-FHMM but is significantly faster. In typical configurations of the N-FHMM,…
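
To make the exponential-versus-linear contrast in the abstract concrete, the following minimal sketch only counts hidden-state configurations per time frame. It is not the paper's algorithm. It assumes each source has its own chain of discrete hidden states, that exact inference must consider every joint combination of those states, and that a factorized approximation tracks one distribution per source. The function names and the choice of 20 states per source are hypothetical, chosen only for illustration.

```python
from math import prod


def joint_state_count(states_per_source):
    """Joint hidden-state combinations per frame (assumed cost driver of exact inference)."""
    return prod(states_per_source)


def factorized_state_count(states_per_source):
    """Per-source states tracked per frame under a fully factorized approximation."""
    return sum(states_per_source)


if __name__ == "__main__":
    # Hypothetical setting: every source has 20 hidden states.
    for num_sources in (1, 2, 3, 4):
        sizes = [20] * num_sources
        print(num_sources, joint_state_count(sizes), factorized_state_count(sizes))
```

With these assumed sizes, the joint count is multiplied by 20 for each added source, while the factorized count grows by only 20 per source. This is the sense in which one grows exponentially and the other linearly in the number of sources.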

Taxonomy

Topics: Speech and Audio Processing · Blind Source Separation Techniques · Music and Audio Processing