Spoken SQuAD: A Study of Mitigating the Impact of Speech Recognition Errors on Listening Comprehension

Chia-Hsuan Li; Szu-Lin Wu; Chi-Liang Liu; Hung-yi Lee

arXiv:1804.00320 · cs.CL · April 3, 2018 · 28 cites

TL;DR

This paper introduces Spoken SQuAD, a new spoken content comprehension task, highlighting the severe impact of speech recognition errors and proposing mitigation strategies to improve machine understanding of spoken language.

Contribution

The paper presents a novel spoken comprehension task and analyzes the impact of speech recognition errors, proposing methods to mitigate their effects.

Findings

01. Speech recognition errors significantly impair comprehension accuracy
02. Proposed mitigation approaches reduce error impact
03. Spoken SQuAD enables evaluation of spoken language understanding

Abstract

Reading comprehension has been widely studied. One of the most representative reading comprehension tasks is the Stanford Question Answering Dataset (SQuAD), on which machines are already comparable with humans. On the other hand, accessing large collections of multimedia or spoken content is much more difficult and time-consuming for humans than accessing plain text. It is therefore highly attractive to develop machines that can automatically understand spoken content. In this paper, we propose a new listening comprehension task, Spoken SQuAD. On the new task, we found that speech recognition errors have a catastrophic impact on machine comprehension, and several approaches are proposed to mitigate the impact.

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Multimodal Machine Learning Applications