Task-specific Optimization of Virtual Channel Linear Prediction-based Speech Dereverberation Front-End for Far-Field Speaker Verification

Joon-Young Yang; Joon-Hyuk Chang

arXiv:2112.13569 · eess.AS · December 28, 2021 · IEEE/ACM Trans. Audio Speech Lang. Process.

TL;DR

This paper introduces a VACE-WPE speech dereverberation front-end, fine-tuned with task-specific optimization, that improves far-field speaker verification. The front-end is explicitly trained to cancel both late reverberation and background noise, and it outperforms neural methods.

Contribution

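The study presents a two-stage training scheme for the VACE-WPE front-end: it is first pretrained for dereverberation, then fine-tuned within a task-specific optimization (TSO) framework against a pretrained speaker embedding model, targeting single-microphone speaker verification in noisy far-field conditions.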

Findings

01

Unlike conventional WPE, VACE-WPE can be explicitly trained to cancel both late reverberation and background noise.

02

Fine-tuning within a task-specific framework improves speaker verification accuracy.

03

The method generalizes well to in-the-wild scenarios beyond controlled conditions.

Abstract

Developing a single-microphone speech denoising or dereverberation front-end for robust automatic speaker verification (ASV) in noisy far-field speaking scenarios is challenging. To address this problem, we present a novel front-end design that involves a recently proposed extension of the weighted prediction error (WPE) speech dereverberation algorithm, the virtual acoustic channel expansion (VACE)-WPE. It is demonstrated experimentally in this study that unlike the conventional WPE algorithm, the VACE-WPE can be explicitly trained to cancel out both late reverberation and background noise. To build the front-end, the VACE-WPE is first independently (pre)trained to produce "noisy" dereverberated signals. Subsequently, given a pretrained speaker embedding model, the VACE-WPE is additionally fine-tuned within a task-specific optimization (TSO) framework, causing the speaker embedding…
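For readers unfamiliar with the conventional WPE algorithm that VACE-WPE extends, the following is a minimal sketch of single-channel WPE for one frequency bin, written in plain Python. It is not the paper's method: it has no virtual channel, no neural network, and no task-specific optimization. The function names (`solve`, `wpe_one_bin`), the default tap count, delay, and iteration count, the power floor, the diagonal loading, and the toy signal are all assumptions chosen for illustration.

```python
import random


def solve(A, b):
    """Solve the square linear system A x = b by Gauss-Jordan elimination."""
    n = len(b)
    M = [list(A[i]) + [b[i]] for i in range(n)]  # augmented matrix [A | b]
    for c in range(n):
        # Partial pivoting: move the largest entry of column c onto the diagonal.
        pivot = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[pivot] = M[pivot], M[c]
        for r in range(n):
            if r != c:
                f = M[r][c] / M[c][c]
                M[r] = [a - f * v for a, v in zip(M[r], M[c])]
    return [M[i][n] / M[i][i] for i in range(n)]


def wpe_one_bin(x, taps=3, delay=2, iters=2, floor=1e-3):
    """Illustrative single-channel WPE for one frequency bin.

    x is a list of complex STFT coefficients, one per frame.
    Returns a list of dereverberated coefficients of the same length.
    """
    T = len(x)
    # Delayed past frames used to predict the late reverberation at frame t.
    past = [[x[t - delay - k] if t - delay - k >= 0 else 0j
             for k in range(taps)] for t in range(T)]
    d = list(x)  # initial estimate: the observed signal itself
    for _ in range(iters):
        # Time-varying power of the current estimate, floored for stability.
        lam = [max(abs(v) ** 2, floor) for v in d]
        R = [[0j] * taps for _ in range(taps)]
        p = [0j] * taps
        for t in range(T):
            y = past[t]
            for i in range(taps):
                p[i] += y[i].conjugate() * x[t] / lam[t]
                for j in range(taps):
                    R[i][j] += y[i].conjugate() * y[j] / lam[t]
        for i in range(taps):
            R[i][i] += 1e-6  # small diagonal loading (assumption) for invertibility
        g = solve(R, p)  # power-weighted least-squares prediction filter
        # Subtract the predicted late reverberation from the observation.
        d = [x[t] - sum(g[k] * past[t][k] for k in range(taps))
             for t in range(T)]
    return d


# Toy example (assumption): white complex noise with one synthetic echo path.
random.seed(0)
dry = [complex(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(200)]
wet = list(dry)
for t in range(2, 200):
    wet[t] += 0.5 * wet[t - 2]
out = wpe_one_bin(wet)
```

The prediction delay keeps the filter from removing the direct sound and early reflections, which is why the first `delay` frames pass through unchanged. Practical implementations typically process every frequency bin and smooth the power estimate over neighboring frames or channels; the instantaneous estimate used here is only the simplest choice.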

Code & Models

Repositories

dreadbird06/tso_vace_wpe
PyTorch · Official

Taxonomy

Topics: Speech and Audio Processing · Speech Recognition and Synthesis · Music and Audio Processing