# A Deep Neural Network for Short-Segment Speaker Recognition

**Authors:** Amirhossein Hajavi, Ali Etemad

arXiv: 1907.10420 · 2019-07-25

## TL;DR

This paper introduces UtterIdNet, a deep neural network designed specifically for short-segment speaker recognition. On the VoxCeleb datasets it is reported to improve significantly on previous models for segments of 2 seconds and shorter, down to 250 ms.

## Contribution

The paper presents a novel neural network architecture tailored for short-duration speech, improving recognition accuracy for segments as short as 250 milliseconds.

## Key findings

- Significant performance improvements over previous models for 2-second, 1-second, and especially sub-second (250 ms and 500 ms) segments.
- Stable and consistent recognition results for segments as short as 250 ms.
- Effective use of information in short speech segments through the proposed architecture.

## Abstract

Today's interactive devices such as smart-phone assistants and smart speakers often deal with short-duration speech segments. As a result, speaker recognition systems integrated into such devices will be much better suited with models capable of performing the recognition task with short-duration utterances. In this paper, a new deep neural network, UtterIdNet, capable of performing speaker recognition with short speech segments is proposed. Our proposed model utilizes a novel architecture that makes it suitable for short-segment speaker recognition through an efficiently increased use of information in short speech segments. UtterIdNet has been trained and tested on the VoxCeleb datasets, the latest benchmarks in speaker recognition. Evaluations for different segment durations show consistent and stable performance for short segments, with significant improvement over the previous models for segments of 2 seconds, 1 second, and especially sub-second durations (250 ms and 500 ms).
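
To make the segment durations in the abstract concrete, the following is a minimal sketch of how an utterance might be cut into fixed-length segments of 250 ms, 500 ms, 1 second, and 2 seconds before evaluation. It is not the authors' pipeline: the helper `split_into_segments`, the 16 kHz sample rate, the non-overlapping windowing, and the silent placeholder signal are all assumptions made for illustration.

```python
def split_into_segments(samples, sample_rate, segment_ms):
    """Cut a sequence of audio samples into non-overlapping fixed-length segments.

    Hypothetical helper for illustration; any trailing samples that do not
    fill a whole segment are dropped.
    """
    seg_len = sample_rate * segment_ms // 1000  # samples per segment
    if seg_len <= 0:
        raise ValueError("segment duration is too short for this sample rate")
    last_start = len(samples) - seg_len
    return [samples[i:i + seg_len] for i in range(0, last_start + 1, seg_len)]


SAMPLE_RATE = 16000                      # assumed sample rate in Hz
utterance = [0.0] * (3 * SAMPLE_RATE)    # 3 seconds of placeholder audio

# Segment durations named in the abstract, in milliseconds.
for duration_ms in (250, 500, 1000, 2000):
    segments = split_into_segments(utterance, SAMPLE_RATE, duration_ms)
    print(duration_ms, "ms:", len(segments), "segments of", len(segments[0]), "samples")
```

At the assumed sample rate, a 250 ms segment holds only 4,000 samples, which gives a sense of how little signal a model has to work with at the shortest duration the paper evaluates.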

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1907.10420/full.md

## Figures

1 figure with captions in the complete paper: https://tomesphere.com/paper/1907.10420/full.md

## References

37 references — full list in the complete paper: https://tomesphere.com/paper/1907.10420/full.md

---
Source: https://tomesphere.com/paper/1907.10420