# VoiceID Loss: Speech Enhancement for Speaker Verification

**Authors:** Suwon Shon, Hao Tang, James Glass

arXiv: 1904.03601 · 2019-07-08

## TL;DR

This paper introduces VoiceID loss, a loss function for training a speech enhancement model to make speaker verification more robust. Instead of a signal-level objective such as the L2 loss, it uses feedback from a speaker verification model to learn a ratio mask that filters out spectrogram components that are unnecessary for verification.

## Contribution

The paper presents VoiceID loss, a novel loss function that trains a speech enhancement network on feedback from a speaker verification model, so that the enhancement targets verification performance rather than signal reconstruction.

## Key findings

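- Passing the input through the enhancement network trained with VoiceID loss makes speaker verification more robust in noisy conditions.
- After training, the enhancement network learns to ignore a substantial number of time-frequency bins, such as those dominated by noise.
- The improvement is consistent across both clean and noisy conditions.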

## Abstract

In this paper, we propose VoiceID loss, a novel loss function for training a speech enhancement model to improve the robustness of speaker verification. In contrast to commonly used loss functions for speech enhancement, such as the L2 loss, the VoiceID loss is based on feedback from a speaker verification model, which is used to generate a ratio mask. The generated ratio mask is multiplied pointwise with the original spectrogram to filter out components that are unnecessary for speaker verification. In the experiments, we observed that the enhancement network, after training with the VoiceID loss, is able to ignore a substantial number of time-frequency bins, such as those dominated by noise, for verification. The resulting model consistently improves the speaker verification system in both clean and noisy conditions.
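
The masking step described in the abstract can be illustrated with a small NumPy sketch. It is not the authors' implementation: the toy verifier (mean pooling followed by a linear layer), the cross-entropy objective against a speaker label, and every name and shape below are assumptions made for illustration. In practice the mask logits would come from an enhancement network and the verifier would be a trained speaker verification model.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def apply_ratio_mask(spectrogram, mask_logits):
    """Squash logits into (0, 1) and multiply pointwise with the spectrogram."""
    mask = sigmoid(mask_logits)
    return mask * spectrogram, mask


def toy_verifier_log_probs(spectrogram, weights):
    """Hypothetical stand-in for a speaker model: mean-pool over time,
    apply a linear layer, and return log-softmax over speakers."""
    pooled = spectrogram.mean(axis=1)      # shape: (n_freq,)
    logits = weights @ pooled              # shape: (n_speakers,)
    logits = logits - logits.max()         # shift for numerical stability
    return logits - np.log(np.exp(logits).sum())


def voiceid_style_loss(spectrogram, mask_logits, weights, speaker_id):
    """Assumed objective: cross-entropy of the verifier on the masked input.
    The verifier weights are treated as fixed; only the mask would be learned."""
    enhanced, _ = apply_ratio_mask(spectrogram, mask_logits)
    return -toy_verifier_log_probs(enhanced, weights)[speaker_id]


rng = np.random.default_rng(0)
n_freq, n_frames, n_speakers = 8, 20, 4

spectrogram = np.abs(rng.normal(size=(n_freq, n_frames)))  # magnitude spectrogram
mask_logits = rng.normal(size=(n_freq, n_frames))          # stand-in for network output
weights = rng.normal(size=(n_speakers, n_freq))            # stand-in verifier weights

enhanced, mask = apply_ratio_mask(spectrogram, mask_logits)
loss = voiceid_style_loss(spectrogram, mask_logits, weights, speaker_id=2)
print(f"loss = {loss:.4f}, mean mask value = {mask.mean():.4f}")
```

Because the mask lies strictly between 0 and 1, each time-frequency bin can only be attenuated, never amplified. A bin whose mask value approaches 0 is effectively ignored by the verifier, which matches the behavior the abstract reports for noise-dominated bins.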

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1904.03601/full.md

## Figures

19 figures with captions in the complete paper: https://tomesphere.com/paper/1904.03601/full.md

## References

23 references — full list in the complete paper: https://tomesphere.com/paper/1904.03601/full.md

---
Source: https://tomesphere.com/paper/1904.03601