Token-Weighted RNN-T for Learning from Flawed Data

Gil Keren; Wei Zhou; Ozlem Kalinli

arXiv:2406.18108 · cs.CL · June 27, 2024 · 1 citation

TL;DR

This paper introduces a token-weighted RNN-T criterion that reduces the impact of transcription errors during training, improving speech recognition accuracy especially in semi-supervised and error-prone data scenarios.

Contribution

The paper proposes a novel token-weighted RNN-T objective that mitigates errors from flawed transcriptions, enhancing model robustness in semi-supervised learning and noisy data conditions.

Findings

01. Up to 38% relative accuracy improvement with pseudo-labels.

02. Recovers 64%-99% of accuracy loss from transcription errors.

03. Effective in both semi-supervised and error-prone training settings.

Abstract

ASR models are commonly trained with the cross-entropy criterion to increase the probability of a target token sequence. While optimizing the probability of all tokens in the target sequence is sensible, one may want to de-emphasize tokens that reflect transcription errors. In this work, we propose a novel token-weighted RNN-T criterion that augments the RNN-T objective with token-specific weights. The new objective is used to mitigate accuracy loss from transcription errors in the training data, which naturally appear in two settings: pseudo-labeling and human annotation errors. Experimental results show that using our method for semi-supervised learning with pseudo-labels leads to a consistent accuracy improvement of up to 38% relative. We also analyze the accuracy degradation resulting from different levels of WER in the reference transcription, and show that token-weighted RNN-T is…
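The idea of token-specific weights can be illustrated with a small sketch. It is not the paper's formulation: it assumes that a per-token log-probability for each target token is already available (in an actual RNN-T these would have to be derived from the alignment lattice, which is not done here), and that each token carries a weight in [0, 1]. The function name `token_weighted_nll` and the example numbers are hypothetical.

```python
import math


def token_weighted_nll(token_log_probs, token_weights):
    """Weighted negative log-likelihood of one target sequence.

    Assumption (not from the paper): token_log_probs[u] holds
    log P(y_u | y_<u, x) for target token u, and token_weights[u]
    is a weight in [0, 1] that de-emphasizes suspect tokens.
    """
    if len(token_log_probs) != len(token_weights):
        raise ValueError("need exactly one weight per target token")
    for w in token_weights:
        if not 0.0 <= w <= 1.0:
            raise ValueError("weights are assumed to lie in [0, 1]")
    # Each token contributes its own log-probability scaled by its weight.
    return -sum(w * lp for w, lp in zip(token_weights, token_log_probs))


# Hypothetical two-token target with probabilities 0.5 and 0.25.
log_probs = [math.log(0.5), math.log(0.25)]

# All weights equal to 1: the ordinary sequence negative log-likelihood.
uniform_loss = token_weighted_nll(log_probs, [1.0, 1.0])

# Second token treated as a likely transcription error and fully ignored.
masked_loss = token_weighted_nll(log_probs, [1.0, 0.0])

print(uniform_loss, masked_loss)
```

With all weights set to 1 the sketch reduces to the unweighted sequence loss, so the weighting only changes training where a token is marked as unreliable. How the weights are obtained is left open here.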

Taxonomy

Topics: Anomaly Detection Techniques and Applications · COVID-19 diagnosis using AI · Adversarial Robustness in Machine Learning