Estimating Uniqueness of I-Vector Representation of Human Voice

Erkam Sinan Tandogan; Husrev Taha Sencar

arXiv:2008.11985·eess.AS·March 4, 2021

TL;DR

This paper investigates the uniqueness of i-vector speech representations, introducing a new entropy-based measure that accounts for speaker variation, and analyzes factors affecting voice distinctiveness using large datasets.

Contribution

The paper proposes a novel entropy-based uniqueness measure for i-vectors that accounts for speaker-level variability, evaluates it on large datasets, and identifies factors that influence voice distinctiveness.

Findings

01. Discretization does not impair speaker verification performance.

02. Voice representation can be uniquely identified with 43-70 bits of information.

03. Longer speech samples increase the distinctiveness of i-vector representations.

Abstract

We study the individuality of the human voice with respect to a widely used feature representation of speech utterances, namely, the i-vector model. As a first step toward this goal, we compare and contrast uniqueness measures proposed for different biometric modalities. Then, we introduce a new uniqueness measure that evaluates the entropy of i-vectors while taking into account speaker level variations. Our measure operates in the discrete feature space and relies on accurate estimation of the distribution of i-vectors. Therefore, i-vectors are quantized while ensuring that both the quantized and original representations yield similar speaker verification performance. Uniqueness estimates are obtained from two newly generated datasets and the public VoxCeleb dataset. The first custom dataset contains more than one and a half million speech samples of 20,741 speakers obtained from TEDx…
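As a rough illustration of the idea of measuring entropy in a discrete feature space, the following minimal sketch quantizes each vector dimension into a few equal-width bins and sums the per-dimension empirical entropies. It is not the authors' measure: the uniform quantizer, the bin range, the function names, and the Gaussian toy data are all assumptions made for illustration, and the sketch ignores the speaker-level variation that the paper's measure accounts for. Summing marginal entropies only gives an upper bound on the joint entropy, since it treats dimensions as independent.

```python
import math
import random
from collections import Counter


def quantize(value, lo, hi, levels):
    """Map a real value to one of `levels` equal-width bins on [lo, hi].

    Hypothetical uniform quantizer; values outside the range are clipped.
    """
    if value <= lo:
        return 0
    if value >= hi:
        return levels - 1
    index = int((value - lo) / (hi - lo) * levels)
    return min(levels - 1, index)  # guard against floating-point round-up


def empirical_entropy_bits(symbols):
    """Plug-in Shannon entropy, in bits, of a sequence of discrete symbols."""
    counts = Counter(symbols)
    n = len(symbols)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def summed_marginal_entropy(vectors, lo=-3.0, hi=3.0, levels=4):
    """Sum of per-dimension entropies of the quantized vectors.

    Assumes independent dimensions, so this upper-bounds the joint entropy.
    """
    dims = len(vectors[0])
    total = 0.0
    for d in range(dims):
        column = [quantize(v[d], lo, hi, levels) for v in vectors]
        total += empirical_entropy_bits(column)
    return total


# Toy stand-in for i-vectors: 5000 random 8-dimensional Gaussian vectors.
random.seed(0)
toy_vectors = [[random.gauss(0.0, 1.0) for _ in range(8)] for _ in range(5000)]
bits = summed_marginal_entropy(toy_vectors)
print(f"summed marginal entropy: {bits:.2f} bits")
```

With 8 dimensions and 4 levels per dimension, the result can never exceed 8 × log2(4) = 16 bits. Real i-vectors have hundreds of dimensions, and a careful estimate would also need to verify that the quantized vectors preserve speaker verification performance, as the abstract describes.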

Taxonomy

Topics: Speech Recognition and Synthesis · Speech and Audio Processing · Music and Audio Processing