On the Encoding of Gender in Transformer-based ASR Representations

Aravind Krishnan; Badr M. Abdullah; Dietrich Klakow

arXiv:2406.09855 · cs.CL · October 8, 2024 · 1 citation

TL;DR

This paper investigates how gender information is encoded in transformer-based ASR models, demonstrating that gender can be erased with minimal impact on performance and revealing where gender information is concentrated within the models.

Contribution

It uses linear erasure to remove gender information from each layer of an ASR model and analyzes where gender is encoded in the latent representations, pointing to the possibility of gender-neutral embeddings.

Findings

01
Gender information is concentrated in the first and last frames of the final layers.

02
Removing gender information has minimal impact on ASR performance.

03
Gender can be effectively erased from each layer's representations using linear erasure.

Abstract

While existing literature relies on performance differences to uncover gender biases in ASR models, a deeper analysis is essential to understand how gender is encoded and utilized during transcript generation. This work investigates the encoding and utilization of gender in the latent representations of two transformer-based ASR models, Wav2Vec2 and HuBERT. Using linear erasure, we demonstrate the feasibility of removing gender information from each layer of an ASR model and show that such an intervention has minimal impact on ASR performance. Additionally, our analysis reveals a concentration of gender information within the first and last frames in the final layers, explaining the ease of erasing gender in these layers. Our findings suggest the prospect of creating gender-neutral embeddings that can be integrated into ASR frameworks without compromising their efficacy.
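To make the idea of linear erasure concrete, here is a minimal sketch on synthetic data. It is not the procedure from the paper: it swaps in the simplest linear eraser, which projects out the direction separating the two class means. The function names, the 16-dimensional toy features, and the binary label array are all assumptions made for illustration; in practice the features would be frame-level hidden states from one layer of a model such as Wav2Vec2 or HuBERT.

```python
import numpy as np

def fit_mean_difference_eraser(X, y):
    """Build a projection that removes the class-mean-difference direction.

    Hypothetical helper for illustration; a simple stand-in for a linear
    erasure method, not the paper's implementation.
    """
    d = X[y == 1].mean(axis=0) - X[y == 0].mean(axis=0)
    d = d / np.linalg.norm(d)                    # unit vector between class means
    return np.eye(X.shape[1]) - np.outer(d, d)   # orthogonal projection onto d's complement

def mean_gap(Z, y):
    """Distance between the two class means in representation space."""
    return np.linalg.norm(Z[y == 1].mean(axis=0) - Z[y == 0].mean(axis=0))

# Synthetic stand-in for layer representations (assumption: 16-dim features).
rng = np.random.default_rng(0)
n, dim = 2000, 16
y = rng.integers(0, 2, size=n)                   # toy binary label
X = rng.normal(size=(n, dim))
X[:, 0] += 3.0 * y                               # inject label signal along one axis

P = fit_mean_difference_eraser(X, y)
X_erased = X @ P                                 # P is symmetric, so X @ P applies it row-wise

gap_before = mean_gap(X, y)
gap_after = mean_gap(X_erased, y)
print(f"class-mean gap before: {gap_before:.3f}, after: {gap_after:.2e}")
```

After the projection the two class means coincide up to floating-point error, while every direction orthogonal to the removed one is left untouched. That is the sense in which a linear intervention can strip a label signal and still preserve most of the representation.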

Code & Models

Repositories

Krishnan-Aravind/Interspeech_2024_Gender
Official

Taxonomy

Topics: Ultrasonics and Acoustic Wave Propagation · Speech and Audio Processing · Fault Detection and Control Systems