TL;DR
This paper investigates how speech recognition errors affect passage retrieval in spoken question answering systems. It argues that retrieval models need to be robust to ASR noise, both across different languages and on real user data.
Contribution
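The paper adds synthetic ASR noise to two existing large-scale datasets (one for passage ranking, one for open-domain QA), evaluates how robust lexical and dense retrievers are to the noisy questions, and presents a new dataset of human-voiced questions for analyzing real-world ASR effects.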
Findings
Both lexical and dense retrievers lose effectiveness when questions contain ASR errors.
Data augmentation improves robustness across domains.
Natural ASR noise degrades retrieval more than synthetic noise.
Abstract
Interacting with a speech interface to query a Question Answering (QA) system is becoming increasingly popular. Typically, QA systems rely on passage retrieval to select candidate contexts and reading comprehension to extract the final answer. While there has been some attention to making the reading comprehension part of QA systems robust to the errors that automatic speech recognition (ASR) models introduce, the passage retrieval part remains unexplored. However, such errors can affect the performance of passage retrieval, leading to inferior end-to-end performance. To address this gap, we augment two existing large-scale passage ranking and open-domain QA datasets with synthetic ASR noise and study the robustness of lexical and dense retrievers against questions with ASR noise. Furthermore, we study the generalizability of data augmentation techniques across different domains; with each…
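To make the evaluation idea in the abstract concrete, here is a minimal Python sketch that compares retrieval quality on clean questions against the same questions with simulated ASR errors. Everything in it is an assumption for illustration: the `CONFUSIONS` table, the toy passages and questions, and the TF-IDF overlap scorer (a simple stand-in for a lexical retriever) are hypothetical and are not taken from the paper, whose actual noise-generation pipeline and retrievers are not described in this summary.

```python
import math
from collections import Counter

# Hypothetical word-level confusions standing in for ASR errors
# (an assumption for illustration, not the paper's noise model).
CONFUSIONS = {"capital": "capitol", "france": "friends",
              "wrote": "road", "hamlet": "omelet"}

def add_noise(question: str) -> str:
    """Replace every word listed in CONFUSIONS with its misrecognized form."""
    return " ".join(CONFUSIONS.get(w, w) for w in question.lower().split())

def rank(question: str, passages: list[str]) -> list[int]:
    """Rank passage indices by a simple TF-IDF term-overlap score."""
    n = len(passages)
    docs = [Counter(p.lower().split()) for p in passages]
    df = Counter(term for doc in docs for term in doc)  # document frequency
    q_terms = set(question.lower().split())

    def score(doc: Counter) -> float:
        return sum(doc[t] * math.log(1 + n / df[t]) for t in q_terms if t in doc)

    # sorted() is stable, so tied passages keep their original index order.
    return sorted(range(n), key=lambda i: score(docs[i]), reverse=True)

def recall_at_k(questions, gold, passages, k=1) -> float:
    """Fraction of questions whose gold passage appears in the top k."""
    hits = sum(gold[i] in rank(q, passages)[:k] for i, q in enumerate(questions))
    return hits / len(questions)

# Toy data (hypothetical).
passages = [
    "paris is the capital and largest city in france",
    "shakespeare wrote hamlet",
    "the capitol building is in washington",
]
questions = ["what is the capital of france", "who wrote hamlet",
             "where is the capitol building"]
gold = [0, 1, 2]

noisy_questions = [add_noise(q) for q in questions]
clean_recall = recall_at_k(questions, gold, passages)
noisy_recall = recall_at_k(noisy_questions, gold, passages)
print(f"recall@1 clean: {clean_recall:.2f}, noisy: {noisy_recall:.2f}")
```

In this toy setup, a misrecognized content word either removes the lexical match entirely or redirects it to a different passage, which is the kind of failure the robustness study measures. A dense retriever could be evaluated the same way by swapping out `rank`.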
