ASTER: Automatic Speech Recognition System Accessibility Testing for Stutterers

Yi Liu; Yuekang Li; Gelei Deng; Felix Juefei-Xu; Yao Du; Cen Zhang; Chengwei Liu; Yeting Li; Lei Ma; Yang Liu

arXiv:2308.15742·cs.SD·August 31, 2023·1 cites

Open Access

TL;DR

This paper introduces ASTER, a novel method for automatically generating diverse, realistic stuttering speech to test and analyze the accessibility of ASR systems, revealing their failure modes and improving inclusivity.

Contribution

ASTER is the first framework that automatically generates valid, diverse stuttering speech samples for testing ASR accessibility, using multi-objective optimization to improve test quality.

Findings

01. Test cases generated by ASTER significantly increase error rates in the tested ASR systems.

02. Generated stuttering speech is indistinguishable from real-world clips.

03. The framework effectively exposes ASR failures related to stuttering.

Abstract

The growing popularity of automatic speech recognition (ASR) systems has increased the need to improve their accessibility. Handling stuttering speech is an important feature for accessible ASR systems. To improve the accessibility of ASR systems for stutterers, we need to expose and analyze the failures of ASR systems on stuttering speech. Existing speech datasets recorded from stutterers are not diverse enough to expose most of these failures. Furthermore, these datasets lack ground truth information about the non-stuttered text, rendering them unsuitable as comprehensive test suites. Therefore, a methodology for generating stuttering speech as test inputs to test and analyze the performance of ASR systems is needed. However, generating valid test inputs in this scenario is challenging. The reason is that although the generated test inputs should mimic how stutterers speak, they…
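The ground-truth argument in the abstract can be illustrated with a small sketch. It is not ASTER's method: ASTER generates stuttering audio, whereas this toy example works on text only. The helper names `inject_repetitions` and `word_error_rate`, the repetition-only stutter model, and the "verbatim" ASR output are all assumptions made for illustration. The point is that when a test input is derived from a known non-stuttered sentence, the ASR output can be scored against that sentence directly.

```python
import random


def inject_repetitions(words, rate, rng):
    """Toy text-level stutter: repeat each word once with probability `rate`.

    Hypothetical helper for illustration; real stuttering also involves
    sound-level repetitions, prolongations, and blocks in the audio.
    """
    out = []
    for word in words:
        out.append(word)
        if rng.random() < rate:
            out.append(word)
    return out


def word_error_rate(reference, hypothesis):
    """Word-level edit distance divided by the number of reference words."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the processed reference
    # prefix and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, substitution))
        prev = curr
    return prev[-1] / len(ref)


# Known non-stuttered ground truth for the generated test case.
reference = "please call me tomorrow"
rng = random.Random(0)
stuttered = inject_repetitions(reference.split(), rate=1.0, rng=rng)

# Assume a hypothetical ASR system that transcribes every repetition verbatim.
hypothesis = " ".join(stuttered)
print(word_error_rate(reference, hypothesis))  # 1.0 (four inserted words / four reference words)
```

A recorded clip without a clean transcript offers no `reference` to compare against, which is the limitation of existing datasets that the abstract points out.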

Taxonomy

Topics: Speech Recognition and Synthesis · Speech and dialogue systems · Stuttering Research and Treatment