Blind Acoustic Parameter Estimation Through Task-Agnostic Embeddings Using Latent Approximations
Philipp Götz, Cagdas Tuna, Andreas Brendel, Andreas Walther, and Emanuël A. P. Habets

TL;DR
This paper introduces a three-stage method that uses auto-encoder latent representations for blind acoustic parameter estimation from single-channel reverberant speech; it outperforms a fully end-to-end trained baseline model.
Contribution
The paper proposes an approach that combines a variational auto-encoder, trained on acoustic impulse responses, with task-agnostic speech embeddings for blind acoustic parameter estimation.
Findings
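The proposed method outperforms a fully end-to-end trained baseline model on two parameter regression tasks.
Acoustic parameters are estimated blindly from short segments of single-channel reverberant speech.
A pre-trained speech encoder supplies low-dimensional latent representations to a small regression model.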
Abstract
We present a method for blind acoustic parameter estimation from single-channel reverberant speech. The method is structured into three stages. In the first stage, a variational auto-encoder is trained to extract latent representations of acoustic impulse responses represented as mel-spectrograms. In the second stage, a separate speech encoder is trained to estimate low-dimensional representations from short segments of reverberant speech. Finally, the pre-trained speech encoder is combined with a small regression model and evaluated on two parameter regression tasks. Experimentally, the proposed method is shown to outperform a fully end-to-end trained baseline model.
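The three stages described in the abstract can be sketched roughly as follows. This is a minimal illustration of the data flow only, not the authors' implementation: the weights are random and untrained, single linear maps stand in for the networks, and all sizes and function names are hypothetical. It also assumes that the speech input is a mel-spectrogram and that the speech encoder is trained with a mean-squared error against the VAE latent mean; the abstract does not state either detail.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes, chosen only for illustration.
N_MEL, N_FRAMES, LATENT_DIM, N_PARAMS = 16, 20, 8, 2


def linear(in_dim, out_dim):
    """Random, untrained weights standing in for a learned network."""
    return rng.normal(scale=0.1, size=(in_dim, out_dim))


# Stage 1: a VAE encoder maps an AIR mel-spectrogram to a Gaussian posterior.
W_mu = linear(N_MEL * N_FRAMES, LATENT_DIM)
W_logvar = linear(N_MEL * N_FRAMES, LATENT_DIM)


def vae_encode(air_mel):
    x = air_mel.reshape(-1)
    return x @ W_mu, x @ W_logvar


def reparameterize(mu, logvar):
    """Draw z = mu + sigma * eps with eps ~ N(0, I)."""
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * logvar) * eps


def kl_to_standard_normal(mu, logvar):
    """KL divergence between N(mu, diag(exp(logvar))) and N(0, I)."""
    return float(-0.5 * np.sum(1.0 + logvar - mu**2 - np.exp(logvar)))


# Stage 2: a separate speech encoder estimates the latent from reverberant speech.
W_speech = linear(N_MEL * N_FRAMES, LATENT_DIM)


def speech_encode(speech_mel):
    return speech_mel.reshape(-1) @ W_speech


def latent_matching_loss(z_speech, z_air):
    """Assumed training objective: mean-squared error between latents."""
    return float(np.mean((z_speech - z_air) ** 2))


# Stage 3: the pre-trained speech encoder feeds a small regression model.
W_head = linear(LATENT_DIM, N_PARAMS)


def estimate_parameters(speech_mel):
    return speech_encode(speech_mel) @ W_head


# Placeholder inputs: random arrays in place of real mel-spectrograms.
air_mel = rng.standard_normal((N_MEL, N_FRAMES))
speech_mel = rng.standard_normal((N_MEL, N_FRAMES))

mu, logvar = vae_encode(air_mel)
z_air = reparameterize(mu, logvar)
kl = kl_to_standard_normal(mu, logvar)
stage2_loss = latent_matching_loss(speech_encode(speech_mel), mu)
params = estimate_parameters(speech_mel)
```

In this sketch only the regression head in stage 3 would be fitted to a specific parameter, which is one way to read the "task-agnostic" embedding in the title: the same speech encoder output could be reused for each of the two regression tasks.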
Taxonomy
Topics: Speech Recognition and Synthesis · Speech and Audio Processing · Music and Audio Processing
