Demographic Attributes Prediction from Speech Using WavLM Embeddings

Yuchen Yang; Thomas Thebaud; Najim Dehak

arXiv:2502.12007 · cs.CL · February 18, 2025

TL;DR

This paper presents a WavLM-based classifier that predicts demographic attributes from speech (age, gender, native language, education, and country), improving on existing models and supporting applications such as language learning and digital forensics.

Contribution

Introduces a novel demographic prediction framework using pretrained WavLM embeddings, achieving state-of-the-art accuracy and robustness across multiple datasets.

Findings

01

Achieves a Mean Absolute Error (MAE) of 4.94 in age prediction

02

Over 99.81% accuracy in gender classification

03

Improves on existing models by up to 30% (relative) in MAE
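The metrics named in these findings can be illustrated with a minimal sketch. The function names, the toy ages and labels, and the baseline MAE of 4.0 are all hypothetical values chosen for illustration; none of them come from the paper.

```python
def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between true and predicted values."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label exactly."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def relative_improvement(baseline_error, new_error):
    """Relative reduction of an error metric with respect to a baseline."""
    return (baseline_error - new_error) / baseline_error


# Toy data, invented for this example only.
true_ages = [23, 35, 41, 60]
pred_ages = [25, 30, 44, 58]
true_gender = ["f", "m", "f", "m"]
pred_gender = ["f", "m", "m", "m"]

mae = mean_absolute_error(true_ages, pred_ages)   # (2 + 5 + 3 + 2) / 4
acc = accuracy(true_gender, pred_gender)          # 3 of 4 correct
gain = relative_improvement(4.0, mae)             # hypothetical baseline MAE of 4.0

print(f"MAE={mae}, accuracy={acc}, relative MAE improvement={gain:.0%}")
```

A "relative" improvement is measured against the baseline's own error, so a drop from 4.0 to 3.0 counts as 25%, not as one point.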

Abstract

This paper introduces a general classifier based on WavLM features to infer demographic characteristics, such as age, gender, native language, education, and country, from speech. Demographic feature prediction plays a crucial role in applications like language learning, accessibility, and digital forensics, enabling more personalized and inclusive technologies. Leveraging pretrained models for embedding extraction, the proposed framework identifies key acoustic and linguistic features associated with demographic attributes, achieving a Mean Absolute Error (MAE) of 4.94 for age prediction and over 99.81% accuracy for gender classification across various datasets. Our system improves upon existing models by up to 30% relative in MAE and up to 10% relative in accuracy and F1 scores across tasks, leveraging a diverse range of datasets and large pretrained models to ensure robustness and…
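As a rough illustration of the general idea of predicting attributes from pretrained embeddings, the sketch below pools frame-level vectors into one utterance vector and passes it to two small prediction heads. It is not the authors' architecture: the mean pooling, the linear heads, the `LinearHeads` class, and the 768-dimensional size (the hidden size of WavLM Base) are assumptions, the frame matrix is random noise standing in for real encoder output, and the weights are untrained.

```python
import numpy as np

EMBED_DIM = 768  # assumed embedding size (WavLM Base hidden size)


def mean_pool(frames: np.ndarray) -> np.ndarray:
    """Average frame-level embeddings of shape (T, D) into one vector (D,)."""
    return frames.mean(axis=0)


class LinearHeads:
    """Hypothetical untrained heads: age regression and gender classification."""

    def __init__(self, dim: int, n_gender_classes: int = 2, seed: int = 0):
        rng = np.random.default_rng(seed)
        self.w_age = rng.normal(scale=0.01, size=dim)
        self.b_age = 0.0
        self.w_gender = rng.normal(scale=0.01, size=(dim, n_gender_classes))
        self.b_gender = np.zeros(n_gender_classes)

    def predict(self, utterance: np.ndarray):
        # Regression head: a single scalar output.
        age = float(utterance @ self.w_age + self.b_age)
        # Classification head: softmax over class logits.
        logits = utterance @ self.w_gender + self.b_gender
        exp = np.exp(logits - logits.max())  # subtract max for numerical stability
        return age, exp / exp.sum()


# Random stand-in for the (frames, dim) output of a pretrained speech encoder.
rng = np.random.default_rng(42)
frames = rng.normal(size=(149, EMBED_DIM))

utterance = mean_pool(frames)
age, gender_probs = LinearHeads(EMBED_DIM).predict(utterance)
print(utterance.shape, gender_probs.shape)
```

In a real setup the random matrix would be replaced by the encoder's hidden states for an audio clip, and the heads would be trained on labeled data; the outputs here carry no meaning beyond showing the shapes involved.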

Taxonomy

Topics: Authorship Attribution and Profiling · Speech Recognition and Synthesis

Methods: Masked autoencoder