HeightCeleb - an enrichment of VoxCeleb dataset with speaker height information

Stanisław Kacprzak; Konrad Kowalczyk

arXiv:2410.12668·cs.SD·January 22, 2025

TL;DR

This paper introduces HeightCeleb, a new dataset extending VoxCeleb with speaker height information, enabling improved speaker height estimation using existing speaker recognition models and simple regression techniques.

Contribution

The creation of the HeightCeleb dataset, which adds automatically extracted height annotations for VoxCeleb speakers and facilitates research in speaker height estimation without additional model training.

Findings

01

Achieved state-of-the-art height estimation results on TIMIT using HeightCeleb data.

02

Demonstrated that pre-trained speaker embeddings can be effectively used for height prediction.

03

Showed that simple regression methods suffice for accurate height estimation.
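The general idea behind the findings above, regressing height on fixed speaker embeddings, can be illustrated with a minimal sketch. This is not the authors' pipeline: the embeddings here are random synthetic vectors standing in for the output of a pre-trained extractor, the heights are simulated, and ridge regression is only one assumed choice of "simple regression". The names `fit_ridge`, `predict`, and `mean_absolute_error` are hypothetical helpers written for this example.

```python
import numpy as np


def fit_ridge(X, y, alpha=1.0):
    """Closed-form ridge regression with an unpenalized intercept."""
    x_mean = X.mean(axis=0)
    y_mean = y.mean()
    Xc = X - x_mean                      # center features
    yc = y - y_mean                      # center targets
    A = Xc.T @ Xc + alpha * np.eye(X.shape[1])
    w = np.linalg.solve(A, Xc.T @ yc)    # regularized normal equations
    b = y_mean - x_mean @ w              # recover the intercept
    return w, b


def predict(X, w, b):
    return X @ w + b


def mean_absolute_error(y_true, y_pred):
    return float(np.mean(np.abs(y_true - y_pred)))


# Synthetic stand-in data (assumption): one embedding per speaker and a
# simulated height in centimetres that depends linearly on the embedding.
rng = np.random.default_rng(0)
n_speakers, dim = 400, 32
embeddings = rng.normal(size=(n_speakers, dim))
true_w = rng.normal(size=dim)
heights_cm = (
    170.0
    + 3.0 * (embeddings @ true_w) / np.sqrt(dim)
    + rng.normal(scale=2.0, size=n_speakers)
)

# Hold out the last 100 speakers for evaluation.
train, test = slice(0, 300), slice(300, None)
w, b = fit_ridge(embeddings[train], heights_cm[train], alpha=1.0)
pred = predict(embeddings[test], w, b)

mae = mean_absolute_error(heights_cm[test], pred)
# Baseline: always predict the mean training height.
baseline = np.full(pred.shape, heights_cm[train].mean())
baseline_mae = mean_absolute_error(heights_cm[test], baseline)
print(f"ridge MAE: {mae:.2f} cm, mean-height baseline MAE: {baseline_mae:.2f} cm")
```

With real data, the synthetic arrays would be replaced by embeddings from a speaker recognition model and the per-speaker heights from the dataset; the printed numbers here say nothing about the paper's reported results.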

Abstract

Prediction of a speaker's height is of interest for voice forensics, surveillance, and automatic speaker profiling. Until now, TIMIT has been the most popular dataset for training and evaluating height estimation methods. In this paper, we introduce HeightCeleb, an extension to VoxCeleb, which is the dataset commonly used in speaker recognition tasks. This enrichment consists of adding height information for all 1251 speakers from VoxCeleb, extracted with an automated method from publicly available sources. Such annotated data will enable the research community to utilize freely available speaker embedding extractors, pre-trained on VoxCeleb, to build more efficient speaker height estimators. In this work, we describe the creation of the HeightCeleb dataset and show that using it makes it possible to achieve state-of-the-art results on the TIMIT test set by using simple…

Code & Models

Repositories

stachu86/HeightCeleb
Official

Datasets

stachu86/HeightCeleb
dataset · 10 dl

Taxonomy

Topics: Speech Recognition and Synthesis