# An End-to-End Text-independent Speaker Verification Framework with a Keyword Adversarial Network

**Authors:** Sungrack Yun, Janghoon Cho, Jungyun Eum, Wonil Chang, Kyuwoong Hwang

arXiv: 1908.02612 · 2019-08-08

## TL;DR

This paper introduces an end-to-end, text-independent speaker verification system that combines speaker embedding learning with adversarial training against an ASR network to improve discrimination and reduce text dependency.

## Contribution

The paper proposes a framework that trains a speaker embedding network with both a triplet loss and the adversarial gradient of an ASR network, aiming for speaker embeddings that are more discriminative and less dependent on the spoken text.

## Key findings

- Lower equal error rate (EER) than the other approaches compared in the paper
- Reduced text dependency of the speaker embeddings
- Evaluated on the LibriSpeech and CHiME 2013 datasets
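
For readers unfamiliar with the metric, the equal error rate is the operating point at which the false acceptance rate equals the false rejection rate. The following is a minimal sketch of how it could be approximated from verification scores; the function name `equal_error_rate`, the threshold sweep, and the example scores are illustrative assumptions, not the paper's evaluation code.

```python
import numpy as np

def equal_error_rate(genuine_scores, impostor_scores):
    """Approximate the EER by sweeping every observed score as a threshold.

    Assumption: higher scores mean "more likely the same speaker".
    """
    genuine = np.asarray(genuine_scores, dtype=float)
    impostor = np.asarray(impostor_scores, dtype=float)
    thresholds = np.sort(np.concatenate([genuine, impostor]))
    # False rejection: a same-speaker trial scores below the threshold.
    frr = np.array([np.mean(genuine < t) for t in thresholds])
    # False acceptance: a different-speaker trial scores at or above it.
    far = np.array([np.mean(impostor >= t) for t in thresholds])
    # Pick the threshold where the two error rates are closest.
    idx = int(np.argmin(np.abs(far - frr)))
    return float((far[idx] + frr[idx]) / 2.0)

# Made-up scores, for illustration only.
eer = equal_error_rate([0.9, 0.6, 0.4, 0.8], [0.1, 0.5, 0.7, 0.2])
print(f"EER = {eer:.2f}")
```

With few trials the sweep only approximates the crossing point; evaluation toolkits typically interpolate along the error curve instead.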

## Abstract

This paper presents an end-to-end text-independent speaker verification framework that jointly considers a speaker embedding (SE) network and an automatic speech recognition (ASR) network. The SE network learns to output an embedding vector that captures the speaker characteristics of the input utterance, while the ASR network learns to recognize the phonetic context of the input. In training our speaker verification framework, we consider both triplet loss minimization and the adversarial gradient of the ASR network to obtain more discriminative and text-independent speaker embedding vectors. With the triplet loss, the distances between embedding vectors of the same speaker are minimized while those between different speakers are maximized. With the adversarial gradient of the ASR network, the text dependency of the speaker embedding vector is reduced. In the experiments, we evaluated our speaker verification framework on the LibriSpeech and CHiME 2013 datasets; the results show that it achieves a lower equal error rate and better text independence than the other approaches.
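
The two training signals described in the abstract can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' implementation: the squared Euclidean distance, the margin of `0.2`, the weight `adv_weight`, and the subtractive form of `se_objective` are all hypothetical choices, since the summary does not specify them.

```python
import numpy as np

def triplet_loss(anchor, positive, negative, margin=0.2):
    """Hinge-style triplet loss averaged over a batch of embeddings.

    Each argument has shape (batch, dim). The distance metric and the
    margin are assumptions made for this sketch.
    """
    d_pos = np.sum((anchor - positive) ** 2, axis=1)  # same-speaker distance
    d_neg = np.sum((anchor - negative) ** 2, axis=1)  # different-speaker distance
    return float(np.mean(np.maximum(d_pos - d_neg + margin, 0.0)))

def se_objective(triplet, asr_loss, adv_weight=0.1):
    """Hypothetical SE-side objective.

    Minimising it lowers the triplet loss while pushing the ASR loss up,
    which mimics applying the ASR gradient to the SE network with a
    flipped sign (the adversarial direction).
    """
    return triplet - adv_weight * asr_loss

# Toy batch: the anchor sits on the positive and far from the negative.
a = np.array([[0.0, 0.0]])
p = np.array([[0.0, 0.0]])
n = np.array([[1.0, 0.0]])
print(triplet_loss(a, p, n))  # the margin is already satisfied
print(triplet_loss(a, n, p))  # roles swapped, so the loss is positive
```

In a real training loop the adversarial term is commonly realised with a gradient reversal step inside an autodiff framework rather than by subtracting scalar losses; the scalar form here only shows the direction of the two pressures on the embedding.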

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1908.02612/full.md

## Figures

5 figures with captions in the complete paper: https://tomesphere.com/paper/1908.02612/full.md

## References

31 references — full list in the complete paper: https://tomesphere.com/paper/1908.02612/full.md

---
Source: https://tomesphere.com/paper/1908.02612