SQuTR: A Robustness Benchmark for Spoken Query to Text Retrieval under Acoustic Noise

Yuejie Li; Ke Yang; Yueying Hua; Berlin Chen; Jianhao Nie; Yueping He; Caixin Kang

arXiv:2602.12783 · cs.IR · May 14, 2026

TL;DR

SQuTR is a benchmark dataset and evaluation protocol for assessing, and helping to improve, the robustness of spoken-query-to-text retrieval systems under noisy acoustic conditions.

Contribution

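SQuTR introduces a large-scale, multi-domain dataset of 37,317 unique queries drawn from six English and Chinese text retrieval datasets. The queries are rendered as synthesized speech using voice profiles from 200 real speakers and mixed with 17 categories of real-world environmental noise at controlled SNR levels, enabling reproducible robustness evaluation of retrieval systems.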

Findings

01. Performance drops significantly with increased noise levels.

02. Different retrieval systems exhibit varying robustness to noise.

03. Even large models struggle under extreme noise conditions.

Abstract

Spoken query retrieval is an important interaction mode in modern information retrieval. However, existing evaluation datasets are often limited to simple queries under constrained noise conditions, making them inadequate for assessing the robustness of spoken query retrieval systems under complex acoustic perturbations. To address this limitation, we present SQuTR, a robustness benchmark for spoken query retrieval that includes a large-scale dataset and a unified evaluation protocol. SQuTR aggregates 37,317 unique queries from six commonly used English and Chinese text retrieval datasets, spanning multiple domains and diverse query types. We synthesize speech using voice profiles from 200 real speakers and mix 17 categories of real-world environmental noise under controlled SNR levels, enabling reproducible robustness evaluation from quiet to highly noisy conditions. Under the unified…
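The abstract mentions mixing environmental noise into synthesized speech under controlled SNR levels. The sketch below illustrates one common way to do this: scale the noise so that the speech-to-noise power ratio matches a target value in decibels, then add it to the speech. It is not taken from the SQuTR code; the function names `mix_at_snr` and `measured_snr_db`, the synthetic test signals, and the example SNR values are all assumptions made for illustration.

```python
import numpy as np


def mix_at_snr(speech, noise, snr_db):
    """Add noise to speech so the result has the requested SNR in dB.

    Hypothetical helper for illustration; not the SQuTR pipeline.
    """
    speech = np.asarray(speech, dtype=np.float64)
    noise = np.asarray(noise, dtype=np.float64)

    # Loop the noise if it is shorter than the speech, then trim to length.
    if len(noise) < len(speech):
        reps = int(np.ceil(len(speech) / len(noise)))
        noise = np.tile(noise, reps)
    noise = noise[: len(speech)]

    speech_power = np.mean(speech ** 2)
    noise_power = np.mean(noise ** 2)
    if speech_power == 0 or noise_power == 0:
        raise ValueError("speech and noise must both have non-zero power")

    # SNR_dB = 10 * log10(P_speech / (gain^2 * P_noise)), solved for gain.
    gain = np.sqrt(speech_power / (noise_power * 10 ** (snr_db / 10)))
    return speech + gain * noise


def measured_snr_db(speech, mixed):
    """Recover the SNR of a mixture, given the clean speech it was built from."""
    speech = np.asarray(speech, dtype=np.float64)
    residual = np.asarray(mixed, dtype=np.float64) - speech
    return 10 * np.log10(np.mean(speech ** 2) / np.mean(residual ** 2))


# Synthetic stand-ins for a spoken query and a noise recording (assumed data).
rng = np.random.default_rng(0)
t = np.arange(16000) / 16000.0
clean = 0.5 * np.sin(2 * np.pi * 220 * t)
babble = rng.standard_normal(4000)  # shorter than the speech, so it is looped

# Example SNR levels chosen for illustration only.
for target in (20, 10, 0, -5):
    noisy = mix_at_snr(clean, babble, target)
    print(f"target {target:>3} dB -> measured {measured_snr_db(clean, noisy):.2f} dB")
```

Because the gain is computed from the noise segment that is actually added, the measured SNR matches the target up to floating-point error, which is what makes a noise condition reproducible across runs. A real pipeline would typically also handle resampling and clipping, which this sketch leaves out.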

Code & Models

Repositories

ttoyekk1a/SQuTR-Spoken-Query-to-Text-Retrieval (GitHub)

Datasets

SLLMCommunity/SQuTR
dataset · 178 dl
