Investigating the Effect of Label Topology and Training Criterion on ASR Performance and Alignment Quality
Tina Raissi, Christoph Lüscher, Simon Berger, Ralf Schlüter, Hermann Ney

TL;DR
This paper compares ASR models that differ in label topology and training criterion, evaluating their alignment quality, word error rate, and real-time factor on the LibriSpeech and Switchboard datasets.
Contribution
It compares, under similar conditions, two discriminative alignment models (with HMM and CTC topology) and two first-order label context models (factored HMM and strictly monotonic RNN transducer).
Findings
HMM-based models show different alignment qualities compared to RNN transducers.
Word error rates vary significantly across models and datasets.
Real-time factors indicate differences in computational efficiency.
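The two scalar metrics named in these findings have standard definitions, sketched below in Python. This is a minimal illustration, not the authors' evaluation code: the function names `word_error_rate` and `real_time_factor` are hypothetical, whitespace tokenization is an assumption, and real evaluations usually apply text normalization first.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()  # assumption: whitespace tokenization
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing in hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


def real_time_factor(processing_seconds: float, audio_seconds: float) -> float:
    """Decoding time divided by audio duration; below 1.0 is faster than real time."""
    if audio_seconds <= 0:
        raise ValueError("audio duration must be positive")
    return processing_seconds / audio_seconds
```

For example, a hypothesis that drops one word from a six-word reference has a word error rate of 1/6, and decoding 10 seconds of audio in 2 seconds gives a real-time factor of 0.2.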
Abstract
The ongoing research scenario for automatic speech recognition (ASR) envisions a clear division between end-to-end approaches and classic modular systems. Even though a high-level comparison between the two approaches in terms of their requirements and (dis)advantages is commonly addressed, a closer comparison under similar conditions is not readily available in the literature. In this work, we present a comparison focused on the label topology and training criterion. We compare two discriminative alignment models with hidden Markov model (HMM) and connectionist temporal classification topology, and two first-order label context ASR models utilizing factored HMM and strictly monotonic recurrent neural network transducer, respectively. We use different measurements for the evaluation of the alignment quality, and compare word error rate and real time factor of our best systems.…
Taxonomy
Topics: Advanced Manufacturing and Logistics Optimization
