Spoken Stereoset: On Evaluating Social Bias Toward Speaker in Speech Large Language Models

Yi-Cheng Lin; Wei-Chih Chen; Hung-yi Lee

arXiv:2408.07665·cs.CL·May 22, 2025

TL;DR

This paper introduces Spoken Stereoset, a dataset for evaluating social biases in Speech Large Language Models (SLLMs). Experiments show that most models exhibit minimal bias, while some still show slightly stereotypical or anti-stereotypical tendencies.

Contribution

The study presents Spoken Stereoset, the first dataset specifically designed to measure social biases in speech-based large language models.

Findings

01 Most models show minimal bias

02 Some models exhibit slightly stereotypical or anti-stereotypical tendencies

03 Bias levels vary across models

Abstract

Warning: This paper may contain texts with uncomfortable content.

Large Language Models (LLMs) have achieved remarkable performance in various tasks, including those involving multimodal data like speech. However, these models often exhibit biases due to the nature of their training data. Recently, more Speech Large Language Models (SLLMs) have emerged, underscoring the urgent need to address these biases. This study introduces Spoken Stereoset, a dataset specifically designed to evaluate social biases in SLLMs. By examining how different models respond to speech from diverse demographic groups, we aim to identify these biases. Our experiments reveal significant insights into their performance and bias levels. The findings indicate that while most models show minimal bias, some still exhibit slightly stereotypical or anti-stereotypical tendencies.

Code & Models

Repositories

dlion168/spoken_stereoset (Official)

Taxonomy

Topics: Speech Recognition and Synthesis · Speech and dialogue systems