MOS-Bias: From Hidden Gender Bias to Gender-Aware Speech Quality Assessment

Wenze Ren; Yi-Cheng Lin; Wen-Chin Huang; Erica Cooper; Ryandhimas E. Zezario; Hsin-Min Wang; Hung-yi Lee; Yu Tsao

arXiv:2603.10723·eess.AS·March 17, 2026

TL;DR

This paper systematically analyzes gender bias in speech quality assessments, revealing consistent scoring disparities between male and female listeners and proposing a gender-aware model to mitigate this bias.

Contribution

The paper presents the first systematic analysis of gender bias in MOS annotations and introduces a gender-aware model that improves prediction accuracy by learning gender-specific scoring patterns.

Findings

01

Male listeners assign higher scores than female listeners, especially in low-quality speech.

02

Automated MOS models trained on aggregated labels skew toward male listeners' scoring; a gender-aware model that learns gender-specific scoring patterns mitigates this.

03

Bias diminishes as speech quality improves, but remains systematic and detectable.
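
To illustrate why a quality-dependent gap resists simple calibration, here is a minimal sketch. The ratings are synthetic and purely illustrative (they are not taken from the paper), and the helper name `gap_by_bin` is hypothetical. It applies a constant-offset correction, one plausible reading of "simple calibration", and shows that a residual gap of opposite sign remains at the two ends of the quality range.

```python
from statistics import mean

# Synthetic, illustrative ratings only (NOT data from the paper):
# (quality_bin, male_listener_score, female_listener_score)
ratings = [
    ("low", 2.4, 1.8),
    ("low", 2.6, 2.0),
    ("mid", 3.3, 3.0),
    ("mid", 3.5, 3.2),
    ("high", 4.5, 4.4),
    ("high", 4.7, 4.6),
]


def gap_by_bin(rows, offset=0.0):
    """Mean (male - offset - female) gap for each quality bin."""
    bins = {}
    for quality, male, female in rows:
        bins.setdefault(quality, []).append(male - offset - female)
    return {quality: mean(gaps) for quality, gaps in bins.items()}


# Gap before any correction: largest for low quality, smallest for high.
raw_gap = gap_by_bin(ratings)

# Constant-offset calibration: subtract the global mean gap from male scores.
global_gap = mean(male - female for _, male, female in ratings)
residual_gap = gap_by_bin(ratings, offset=global_gap)

for quality in ("low", "mid", "high"):
    print(f"{quality:>4}: raw={raw_gap[quality]:+.2f} "
          f"residual={residual_gap[quality]:+.2f}")
```

In this toy setup the single offset removes the average gap but leaves low-quality items under-corrected and high-quality items over-corrected, which is the kind of structure a quality-independent correction cannot absorb.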

Abstract

The Mean Opinion Score (MOS) serves as the standard metric for speech quality assessment, yet biases in human annotations remain underexplored. We conduct the first systematic analysis of gender bias in MOS, revealing that male listeners consistently assign higher scores than female listeners, a gap that is most pronounced in low-quality speech and gradually diminishes as quality improves. This quality-dependent structure proves difficult to eliminate through simple calibration. We further demonstrate that automated MOS models trained on aggregated labels exhibit predictions skewed toward male standards of perception. To address this, we propose a gender-aware model that learns gender-specific scoring patterns through abstracting binary group embeddings, thereby improving overall and gender-specific prediction accuracy. This study establishes that gender bias in MOS constitutes a…
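
The abstract does not spell out the model architecture, so the following is only a generic sketch of conditioning a MOS predictor on a binary group embedding. All names (`group_embedding`, `head_weights`, `predict_mos`) and all parameter values are hypothetical and hand-set for illustration; in a real system they would be learned from per-listener labels, and the utterance features would come from a speech encoder.

```python
FEAT_DIM = 4  # size of the (assumed) utterance feature vector
EMB_DIM = 2   # size of the (assumed) listener-group embedding

# Hypothetical, hand-set parameters; a trained model would learn these.
# Rows 0 and 1 index the two listener groups.
group_embedding = [
    [0.5, -0.2],
    [-0.3, 0.4],
]
head_weights = [0.4, -0.1, 0.3, 0.2, 0.6, 0.1]  # FEAT_DIM + EMB_DIM weights
head_bias = 3.0


def predict_mos(features, group_id):
    """Predict a score from utterance features concatenated with a group embedding."""
    if group_id not in (0, 1):
        raise ValueError("group_id must be 0 or 1")
    if len(features) != FEAT_DIM:
        raise ValueError(f"expected {FEAT_DIM} features")
    x = list(features) + group_embedding[group_id]
    raw = sum(w * v for w, v in zip(head_weights, x)) + head_bias
    # Keep the prediction on the usual 1-5 MOS scale.
    return min(5.0, max(1.0, raw))


# The same utterance receives a different predicted score per listener group.
example_features = [0.2, -0.1, 0.5, 0.3]
score_group0 = predict_mos(example_features, 0)
score_group1 = predict_mos(example_features, 1)
print(score_group0, score_group1)
```

Because the group embedding enters the prediction head alongside the utterance features, one network can output group-specific scores for the same audio instead of a single aggregated estimate.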

Taxonomy

Topics: Speech and Audio Processing · Speech Recognition and Synthesis · Emotion and Mood Recognition