NoRefER: a Referenceless Quality Metric for Automatic Speech Recognition via Semi-Supervised Language Model Fine-Tuning with Contrastive Learning

Kamer Ali Yuksel; Thiago Ferreira; Golara Javadi; Mohamed El-Badrashiny; Ahmet Gunduz

arXiv:2306.12577·cs.CL·June 23, 2023·1 cites

Open Access · 1 Repo

TL;DR

NoRefER is a semi-supervised, contrastive learning-based language model that accurately assesses ASR hypothesis quality without needing reference transcripts, enabling efficient model comparison and error detection.

Contribution

It introduces a novel semi-supervised, contrastive learning approach for referenceless ASR quality evaluation using a multilingual language model.

Findings

01 High correlation with reference-based metrics

02 Effective intra-sample hypothesis ranking

03 Potential for referenceless ASR evaluation
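As a rough illustration of what "correlation with reference-based metrics" can mean in practice, the sketch below computes word error rate (WER), a common reference-based metric, and a Kendall rank correlation between WER and a referenceless score. This is a minimal sketch under stated assumptions: this listing does not say which metric or correlation coefficient the paper uses, and the `scores` values are made-up placeholders, not outputs of NoRefER.

```python
from itertools import combinations


def wer(reference, hypothesis):
    """Word error rate: word-level edit distance divided by reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    prev = list(range(len(hyp) + 1))  # distances against an empty reference
    for i, r in enumerate(ref, 1):
        cur = [i]
        for j, h in enumerate(hyp, 1):
            cur.append(min(prev[j] + 1,              # deletion
                           cur[j - 1] + 1,           # insertion
                           prev[j - 1] + (r != h)))  # substitution or match
        prev = cur
    return prev[-1] / max(len(ref), 1)


def kendall_tau(xs, ys):
    """Kendall tau-a: (concordant - discordant) pairs over all pairs."""
    concordant = discordant = 0
    for i, j in combinations(range(len(xs)), 2):
        sign = (xs[i] - xs[j]) * (ys[i] - ys[j])
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    n_pairs = len(xs) * (len(xs) - 1) // 2
    return (concordant - discordant) / n_pairs if n_pairs else 0.0


reference = "the cat sat on the mat"
hypotheses = [
    "the cat sat on the mat",
    "the cat sad on the mat",
    "the cat sad on a mat",
]
wers = [wer(reference, h) for h in hypotheses]

# Hypothetical referenceless quality scores (higher = better); placeholders only.
scores = [0.9, 0.6, 0.2]

# A good quality score should rank hypotheses in the opposite order of WER,
# so the score is correlated against the negated WER.
tau = kendall_tau(scores, [-w for w in wers])
```

In this toy case the placeholder scores order the hypotheses exactly as WER does, so the rank correlation is perfect; real scores would generally agree only partially.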

Abstract

This paper introduces NoRefER, a novel referenceless quality metric for automatic speech recognition (ASR) systems. Traditional reference-based metrics for evaluating ASR systems require costly ground-truth transcripts. NoRefER overcomes this limitation by fine-tuning a multilingual language model for pair-wise ranking of ASR hypotheses using contrastive learning with a Siamese network architecture. The self-supervised NoRefER exploits the known quality relationships between hypotheses from multiple compression levels of an ASR model to learn to rank intra-sample hypotheses by quality, which is essential for model comparisons. The semi-supervised version also uses a referenced dataset to improve its inter-sample quality ranking, which is crucial for selecting potentially erroneous samples. The results indicate that NoRefER correlates highly with reference-based metrics and their intra-sample…
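The self-supervised idea in the abstract can be sketched as follows: hypotheses for the same audio sample, produced at different compression levels, are turned into (better, worse) pairs, and one shared scorer is applied to both sides of each pair, as in a Siamese setup. This is a minimal sketch, not the authors' implementation. It assumes that a higher level number means more compression and lower quality, uses a logistic pairwise loss as a stand-in for the paper's contrastive objective, and replaces the fine-tuned language model with a hypothetical `toy_score` function.

```python
import math
from itertools import combinations


def make_pairs(hyps_by_level):
    """Build (better, worse) pairs for one audio sample.

    hyps_by_level maps a compression level (assumption: higher = more
    compressed, hence lower expected quality) to a hypothesis string.
    """
    pairs = []
    ordered = sorted(hyps_by_level.items())  # least compressed first
    for (_, hyp_a), (_, hyp_b) in combinations(ordered, 2):
        if hyp_a != hyp_b:  # identical outputs carry no ranking signal
            pairs.append((hyp_a, hyp_b))
    return pairs


def pairwise_loss(score_better, score_worse):
    """Logistic ranking loss, -log(sigmoid(score_better - score_worse)),
    written in a numerically stable softplus form."""
    diff = score_better - score_worse
    return max(-diff, 0.0) + math.log1p(math.exp(-abs(diff)))


def toy_score(text, vocab=frozenset({"the", "cat", "sat", "on", "mat"})):
    """Hypothetical stand-in for the shared language-model scorer:
    the fraction of tokens found in a tiny vocabulary."""
    tokens = text.split()
    return sum(t in vocab for t in tokens) / max(len(tokens), 1)


hyps = {
    0: "the cat sat on the mat",  # least compressed model
    1: "the cat sad on the mat",
    2: "the cat sad on a mat",    # most compressed model
}
pairs = make_pairs(hyps)

# Siamese use: the same scorer is applied to both members of every pair.
losses = [pairwise_loss(toy_score(b), toy_score(w)) for b, w in pairs]
mean_loss = sum(losses) / len(losses)
```

The loss shrinks when the scorer rates the less-compressed hypothesis above the more-compressed one and grows when the order is reversed; in training, that signal would update the shared scorer's weights.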

Code & Models

Repositories

aixplain/NoRefER
PyTorch · Official

Taxonomy

Topics: Speech Recognition and Synthesis · Speech and Audio Processing · Natural Language Processing Techniques

Methods: Contrastive Learning · Siamese Network