TRScore: A Novel GPT-based Readability Scorer for ASR Segmentation and Punctuation model evaluation and selection
Piyush Behre, Sharman Tan, Amy Shah, Harini Kesavamoorthy, Shuangyu Chang, Fei Zuo, Chris Basoglu, Sayan Pathak

TL;DR
TRScore is a GPT-based readability metric for evaluating ASR segmentation and punctuation that correlates well with human judgments and reduces reliance on costly human transcriptions.
Contribution
The paper introduces TRScore, a novel GPT-based readability measure for ASR output that correlates with human assessments and improves model evaluation efficiency.
Findings
TRScore correlates with human readability scores (Pearson's r=0.98).
TRScore correlates with F1 scores (Pearson's r=0.67).
TRScore eliminates the need for human transcriptions in model selection.
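The correlation figures above are Pearson coefficients. As a minimal sketch of how such a coefficient is computed, the snippet below implements Pearson's r in plain Python. The function name `pearson_r` and the score lists are hypothetical and made up purely for illustration; they are not data from the paper.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations (unnormalized covariance).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations (unnormalized variances).
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical per-system scores, invented for this example only.
human_scores = [1.0, 2.0, 3.0, 4.0, 5.0]
metric_scores = [1.2, 1.9, 3.2, 3.8, 5.1]

r = pearson_r(human_scores, metric_scores)
print(f"Pearson's r = {r:.3f}")
```

A value near 1 means the automatic metric ranks and spaces the systems almost exactly as the human raters do, which is the sense in which a readability metric can stand in for human evaluation.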
Abstract
Punctuation and Segmentation are key to readability in Automatic Speech Recognition (ASR), often evaluated using F1 scores that require high-quality human transcripts and do not reflect readability well. Human evaluation is expensive, time-consuming, and suffers from large inter-observer variability, especially in conversational speech devoid of strict grammatical structures. Large pre-trained models capture a notion of grammatical structure. We present TRScore, a novel readability measure using the GPT model to evaluate different segmentation and punctuation systems. We validate our approach with human experts. Additionally, our approach enables quantitative assessment of text post-processing techniques such as capitalization, inverse text normalization (ITN), and disfluency on overall readability, which traditional word error rate (WER) and slot error rate (SER) metrics fail to…
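To illustrate why F1-based evaluation depends on a human reference transcript, here is a simplified sketch of a punctuation F1 computation. It is an assumption-laden toy, not the paper's evaluation procedure: it assumes the reference and hypothesis contain the same words in the same order, treats only `.,?!` as punctuation, and pools all marks into one score. The helper names `punct_labels` and `punctuation_f1` are hypothetical.

```python
def punct_labels(text, marks=".,?!"):
    """Split text into (lowercased word, trailing punctuation) pairs."""
    labels = []
    for token in text.split():
        word = token.rstrip(marks)
        punct = token[len(word):]
        labels.append((word.lower(), punct))
    return labels


def punctuation_f1(reference, hypothesis):
    """F1 over punctuation slots, assuming identical word sequences."""
    ref = punct_labels(reference)
    hyp = punct_labels(hypothesis)
    if [w for w, _ in ref] != [w for w, _ in hyp]:
        # Real ASR output would need word alignment first; out of scope here.
        raise ValueError("word sequences differ; alignment is required")
    # A true positive is a slot where both sides place the same mark.
    tp = sum(1 for (_, r), (_, h) in zip(ref, hyp) if r and r == h)
    n_ref = sum(1 for _, r in ref if r)
    n_hyp = sum(1 for _, h in hyp if h)
    precision = tp / n_hyp if n_hyp else 0.0
    recall = tp / n_ref if n_ref else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Invented example sentences, for illustration only.
reference = "hello, how are you? i am fine."
hypothesis = "hello how are you. i am fine."
f1 = punctuation_f1(reference, hypothesis)
print(f"punctuation F1 = {f1:.2f}")
```

Note that the hypothesis here is penalized for writing a period where the reference has a question mark, even though a reader might find both versions similarly readable. This is the kind of mismatch between F1 and readability that the abstract describes.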
Taxonomy
Topics: Natural Language Processing Techniques · Speech Recognition and Synthesis · Topic Modeling
Methods: Multi-Head Attention · Attention Is All You Need · Layer Normalization · Cosine Annealing · Byte Pair Encoding · Residual Connection · Dropout · Weight Decay
