A Generative Product-of-Filters Model of Audio
Dawen Liang, Matthew D. Hoffman, Gautham J. Mysore

TL;DR
The paper introduces the product-of-filters (PoF) model, a generative model that decomposes audio spectra into sparse linear combinations of filters in the log-spectral domain. The authors demonstrate it on a bandwidth expansion task and as an unsupervised feature extractor for speaker identification.
Contribution
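It formulates the PoF model, which replaces the hand-designed decompositions of classic homomorphic filtering with a decomposition learned through statistical inference, and derives a mean-field method for posterior inference and a variational EM algorithm for estimating the model's free parameters.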
Findings
PoF effectively performs bandwidth expansion in audio spectra.
PoF serves as a useful unsupervised feature extractor for speaker identification.
Together, these two tasks indicate the model's potential for broader audio processing applications.
Abstract
We propose the product-of-filters (PoF) model, a generative model that decomposes audio spectra as sparse linear combinations of "filters" in the log-spectral domain. PoF makes similar assumptions to those used in the classic homomorphic filtering approach to signal processing, but replaces hand-designed decompositions built of basic signal processing operations with a learned decomposition based on statistical inference. This paper formulates the PoF model and derives a mean-field method for posterior inference and a variational EM algorithm to estimate the model's free parameters. We demonstrate PoF's potential for audio processing on a bandwidth expansion task, and show that PoF can serve as an effective unsupervised feature extractor for a speaker identification task.
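The core idea in the abstract, a spectrum whose logarithm is a sparse linear combination of filters, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the names `U`, `A`, `synthesize_spectra`, and `product_of_filters` are hypothetical, the filter matrix is random rather than learned, and drawing activations from a gamma distribution with shape below 1 is only an assumed way to get mostly near-zero (sparse-like) values. The mean-field inference and variational EM steps are not shown.

```python
import numpy as np


def synthesize_spectra(U, A):
    """Return spectra whose log is the linear combination U @ A.

    U: (F, L) matrix, one log-domain filter per column (hypothetical name).
    A: (L, T) nonnegative activations, one column per frame (hypothetical name).
    """
    return np.exp(U @ A)


def product_of_filters(U, a):
    """Build one frame's spectrum explicitly as a product of L filters.

    Column l of `filters` is exp(U[:, l] * a[l]); multiplying the columns
    elementwise gives exp(U @ a), since exp of a sum is a product of exps.
    """
    filters = np.exp(U * a[np.newaxis, :])  # shape (F, L)
    return filters.prod(axis=1)             # shape (F,)


rng = np.random.default_rng(0)
F, L, T = 8, 4, 3                            # frequency bins, filters, frames

# Assumption for illustration only: random filters and gamma activations.
U = rng.normal(scale=0.5, size=(F, L))
A = rng.gamma(shape=0.5, scale=1.0, size=(L, T))

W = synthesize_spectra(U, A)                 # (F, T) positive spectra
frame0 = product_of_filters(U, A[:, 0])      # same as W[:, 0]
```

Both functions give the same result for a given frame. The second form makes the "product of filters" reading explicit: each activation scales one filter in the log domain, which corresponds to raising that filter to a power in the linear spectral domain.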
Taxonomy
Topics: Speech and Audio Processing · Music and Audio Processing · Acoustic Wave Phenomena Research
