Shouted Speech Compensation for Speaker Verification Robust to Vocal Effort Conditions

Santi Prieto; Alfonso Ortega; Iván López-Espejo; Eduardo Lleida

arXiv:2008.02487·eess.AS·August 7, 2020

TL;DR

This paper introduces a linear compensation method using Gaussian mixture models to improve speaker verification accuracy across different vocal effort conditions, such as shouted versus normal speech.

Contribution

It presents a novel application of GMM-based compensation techniques from speech recognition to address vocal effort mismatch in speaker verification systems.

Findings

01. Up to 13.8% relative improvement in equal error rate (EER) with compensation

02. Effective shouted speech detection using logistic regression

03. Back-end compensation enhances robustness to vocal effort variations
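For readers unfamiliar with how a relative EER improvement such as the 13.8% figure above is computed, here is a minimal sketch of the arithmetic. The function name `relative_improvement` and the EER values in the usage line are hypothetical and chosen only for illustration; they are not results from the paper.

```python
def relative_improvement(baseline_eer: float, compensated_eer: float) -> float:
    """Relative EER reduction, in percent, with respect to the baseline EER."""
    if baseline_eer <= 0:
        raise ValueError("baseline EER must be positive")
    return 100.0 * (baseline_eer - compensated_eer) / baseline_eer


# Illustrative, made-up EERs (in %): a drop from 10.00 to 8.62
# corresponds to a relative improvement of roughly 13.8%.
example = relative_improvement(10.0, 8.62)
```

A relative figure depends on the baseline: the same absolute drop in EER gives a larger relative improvement when the baseline EER is lower.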

Abstract

The performance of speaker verification systems degrades when vocal effort conditions differ between enrollment and test (e.g., shouted vs. normal speech). This situation can arise in non-cooperative speaker verification tasks. In this paper, we present a study of different methods for linear compensation of embeddings that use Gaussian mixture models to cluster the shouted and normal speech domains. These compensation techniques are borrowed from the area of robustness for automatic speech recognition and, in this work, we apply them to compensate for the mismatch between shouted and normal conditions in speaker verification. Before compensation, the shouted condition is automatically detected by means of logistic regression. The process is computationally light and is performed in the back-end of an x-vector system. Experimental results show that applying the proposed approach…
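The pipeline the abstract describes (detect the shouted condition, then linearly compensate the embedding using a GMM over the shouted domain) can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes a diagonal-covariance GMM and a bias-only correction weighted by component posteriors, which is one simple form of the linear compensation used in robust speech recognition. All names (`is_shouted`, `compensate`, `back_end`) and all `TOY_*` parameter values are hypothetical; in practice the detector weights, the GMM, and the bias vectors would be estimated from training data.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def is_shouted(x, w, b, threshold=0.5):
    """Logistic-regression detector: True if P(shouted | x) > threshold."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) > threshold


def log_gauss_diag(x, mean, var):
    """Log-density of a diagonal-covariance Gaussian at x."""
    return -0.5 * sum(
        math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v
        for xi, m, v in zip(x, mean, var)
    )


def posteriors(x, weights, means, variances):
    """Posterior probability of each GMM component given x."""
    logs = [
        math.log(pi) + log_gauss_diag(x, m, v)
        for pi, m, v in zip(weights, means, variances)
    ]
    top = max(logs)  # subtract the maximum for numerical stability
    exps = [math.exp(value - top) for value in logs]
    total = sum(exps)
    return [e / total for e in exps]


def compensate(x, gmm, biases):
    """Subtract the posterior-weighted sum of per-component bias vectors."""
    post = posteriors(x, *gmm)
    return [
        xi - sum(p * r[d] for p, r in zip(post, biases))
        for d, xi in enumerate(x)
    ]


def back_end(x, w, b, gmm, biases):
    """Compensate only the embeddings that the detector flags as shouted."""
    return compensate(x, gmm, biases) if is_shouted(x, w, b) else list(x)


# Toy 2-D setup with made-up parameters (assumptions, not values from the paper).
TOY_W, TOY_B = [1.0, 0.0], -1.0
TOY_GMM = ([0.5, 0.5], [[2.0, 0.0], [6.0, 0.0]], [[1.0, 1.0], [1.0, 1.0]])
TOY_BIASES = [[2.0, 0.0], [6.0, 0.0]]

shouted_out = back_end([2.0, 0.0], TOY_W, TOY_B, TOY_GMM, TOY_BIASES)
normal_out = back_end([0.0, 0.0], TOY_W, TOY_B, TOY_GMM, TOY_BIASES)
```

In this toy setup, the embedding at `[2.0, 0.0]` is flagged as shouted and shifted back toward the origin, while the embedding at `[0.0, 0.0]` is left unchanged. Both steps operate on a single embedding vector with a handful of multiply-adds per GMM component, which is consistent with the abstract's description of a computationally light back-end process.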
