SD-QA: Spoken Dialectal Question Answering for the Real World

Fahim Faisal; Sharlina Keshava; Md Mahfuz ibn Alam; Antonios Anastasopoulos

arXiv:2109.12072·cs.CL·September 27, 2021

TL;DR

This paper introduces SD-QA, a multi-dialect spoken question answering benchmark covering five languages. The benchmark accounts for speech recognition errors and dialectal variation, and the paper uses it to analyze the performance and fairness of ASR and QA models.

Contribution

The paper augments an existing QA dataset into a multi-dialect spoken QA dataset for five languages that captures speech recognition errors and dialectal diversity, and it provides baseline evaluations and a fairness analysis.

Findings

01. Baseline results reveal the impact of dialect and speaker attributes on QA performance.

02. The dataset exposes challenges in speech recognition and QA systems for dialectal and multilingual settings.

03. Analysis shows disparities in model performance across different user populations.

Abstract

Question answering (QA) systems are now available through numerous commercial applications for a wide variety of domains, serving millions of users that interact with them via speech interfaces. However, current benchmarks in QA research do not account for the errors that speech recognition models might introduce, nor do they consider the language variations (dialects) of the users. To address this gap, we augment an existing QA dataset to construct a multi-dialect, spoken QA benchmark on five languages (Arabic, Bengali, English, Kiswahili, Korean) with more than 68k audio prompts in 24 dialects from 255 speakers. We provide baseline results showcasing the real-world performance of QA systems and analyze the effect of language variety and other sensitive speaker attributes on downstream performance. Last, we study the fairness of the ASR and QA models with respect to the underlying user…
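
One way to look at the per-dialect disparities the abstract describes is to score system answers separately for each dialect and compare the averages. The sketch below is illustrative only and is not the paper's evaluation code: `token_f1`, `per_dialect_f1`, `dialect_gap`, the record layout, and the toy data are all hypothetical, and token-overlap F1 is assumed here as a common QA metric rather than one confirmed by this page.

```python
from collections import Counter, defaultdict


def token_f1(prediction: str, reference: str) -> float:
    """Token-overlap F1 between a predicted and a reference answer."""
    pred_tokens = prediction.lower().split()
    ref_tokens = reference.lower().split()
    if not pred_tokens or not ref_tokens:
        # Both empty counts as a match; one empty counts as a miss.
        return float(pred_tokens == ref_tokens)
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


def per_dialect_f1(records):
    """Average F1 per dialect; each record is (dialect, prediction, reference)."""
    scores = defaultdict(list)
    for dialect, prediction, reference in records:
        scores[dialect].append(token_f1(prediction, reference))
    return {dialect: sum(vals) / len(vals) for dialect, vals in scores.items()}


def dialect_gap(averages):
    """Difference between the best-served and worst-served dialect."""
    return max(averages.values()) - min(averages.values())


# Hypothetical toy records, NOT taken from SD-QA.
records = [
    ("dialect_a", "paris", "paris"),
    ("dialect_a", "the eiffel tower", "eiffel tower"),
    ("dialect_b", "london", "paris"),
    ("dialect_b", "paris", "paris"),
]
averages = per_dialect_f1(records)
gap = dialect_gap(averages)
print(averages, gap)
```

In practice the predictions would come from a QA system fed ASR transcripts of the spoken questions, so a large gap could reflect either transcription errors or downstream QA errors; separating the two would need the ASR error rate per dialect as well.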

Code & Models

Repositories

ffaisal93/sd-qa
Official

Datasets

meituan-longcat/UNO-Bench
dataset · 2.0k downloads