Supplementary Resources and Analysis for Automatic Speech Recognition Systems Trained on the Loquacious Dataset

Nick Rossenbach; Robin Schmitt; Tina Raissi; Simon Berger; Larissa Kleppel; Ralf Schlüter

arXiv:2512.17915 · cs.CL · December 23, 2025

Open Access

TL;DR

This paper introduces supplementary resources for the Loquacious dataset, namely n-gram language models, a grapheme-to-phoneme (G2P) model, and pronunciation lexica, and demonstrates their utility across a range of ASR architectures to support benchmarking and analysis.

Contribution

It provides open access to additional resources for the Loquacious dataset and evaluates their effectiveness across multiple ASR models, advancing benchmarking capabilities.

Findings

01 The Loquacious dataset is effective for ASR benchmarking

02 The supplementary resources improve ASR performance analysis

03 The dataset covers diverse acoustic and language domains

Abstract

The recently published Loquacious dataset aims to be a replacement for established English automatic speech recognition (ASR) datasets such as LibriSpeech or TED-LIUM. The main goal of the Loquacious dataset is to provide properly defined training and test partitions across many acoustic and language domains, with an open license suitable for both academia and industry. To further promote the benchmarking and usability of this new dataset, we present additional resources in the form of n-gram language models (LMs), a grapheme-to-phoneme (G2P) model, and pronunciation lexica, with open and public access. Using these additional resources, we show experimental results across a wide range of ASR architectures with different label units and topologies. Our initial experimental results indicate that the Loquacious dataset offers a valuable case study for a variety of common challenges in…

Taxonomy

Topics: Speech Recognition and Synthesis · Music and Audio Processing · Voice and Speech Disorders