TRScore: A Novel GPT-based Readability Scorer for ASR Segmentation and Punctuation model evaluation and selection

Piyush Behre; Sharman Tan; Amy Shah; Harini Kesavamoorthy; Shuangyu Chang; Fei Zuo; Chris Basoglu; Sayan Pathak

arXiv:2210.15104·cs.CL·October 28, 2022

Open Access

TL;DR

TRScore is a GPT-based readability metric for evaluating ASR segmentation and punctuation that correlates well with human judgments and reduces reliance on costly human transcriptions.

Contribution

The paper introduces TRScore, a novel GPT-based readability measure for ASR output that correlates with human assessments and improves model evaluation efficiency.

Findings

01

TRScore correlates with human readability scores (Pearson's r=0.98).

02

TRScore correlates with F1 scores (Pearson's r=0.67).

03

TRScore eliminates the need for human transcriptions in model selection.
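
The correlations reported above are Pearson's r values. As a minimal sketch of how such a correlation between an automatic readability metric and human ratings could be computed, the snippet below implements Pearson's r in plain Python. The function name `pearson_r` and the two score lists are hypothetical and purely illustrative; they are not data or code from the paper.

```python
import math


def pearson_r(xs, ys):
    """Return Pearson's correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations (unnormalized covariance).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations (unnormalized variances).
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical scores for five systems (illustrative only, not from the paper).
metric_scores = [1.0, 2.0, 3.0, 4.0, 5.0]  # automatic readability metric
human_scores = [1.2, 1.9, 3.1, 3.8, 5.0]   # mean human readability rating

r = pearson_r(metric_scores, human_scores)
print(f"Pearson's r = {r:.3f}")
```

A value of r close to 1 indicates that the metric ranks and spaces the systems almost linearly with the human ratings, which is the sense in which the findings describe the metric as correlating with human readability scores.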

Abstract

Punctuation and segmentation are key to readability in Automatic Speech Recognition (ASR). They are often evaluated using F1 scores, which require high-quality human transcripts and do not reflect readability well. Human evaluation is expensive, time-consuming, and suffers from large inter-observer variability, especially in conversational speech devoid of strict grammatical structures. Large pre-trained models capture a notion of grammatical structure. We present TRScore, a novel readability measure using the GPT model to evaluate different segmentation and punctuation systems. We validate our approach with human experts. Additionally, our approach enables quantitative assessment of text post-processing techniques such as capitalization, inverse text normalization (ITN), and disfluency on overall readability, which traditional word error rate (WER) and slot error rate (SER) metrics fail to…

Taxonomy

Topics: Natural Language Processing Techniques · Speech Recognition and Synthesis · Topic Modeling

Methods: Multi-Head Attention · Attention Is All You Need · Layer Normalization · Cosine Annealing · Byte Pair Encoding · Residual Connection · Dropout · Weight Decay