Improving Word Recognition in Speech Transcriptions by Decision-level Fusion of Stemming and Two-way Phoneme Pruning

Sunakshi Mehra; Seba Susan

arXiv:2107.12428·cs.CL·July 28, 2021


TL;DR

This paper presents an unsupervised method that fuses stemming and two-way phoneme pruning at the decision level, raising word recognition accuracy on LRW speech transcriptions from a 9.34% baseline to 32.96%.

Contribution

The paper introduces an unsupervised, decision-level fusion of stemming and two-way phoneme pruning for correcting highly imperfect speech transcripts.

Findings

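01

Stemming alone raised word recognition accuracy from the 9.34% baseline to 23.34%.

02

Decision-level fusion of stemming and two-way phoneme pruning raised accuracy further, to 32.96%.

03

The method improves the accuracy of transcripts derived from video audio, as evaluated on the LRW dataset.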
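
The idea of decision-level fusion can be illustrated with a minimal sketch. The summary does not state which fusion rule the authors use, so the logical OR below is an assumption, and the function names and the per-clip decisions are hypothetical toy values, not results from the paper.

```python
def accuracy(decisions):
    """Fraction of clips whose target word was judged as recognized."""
    return sum(decisions) / len(decisions) if decisions else 0.0


def fuse_or(stem_hits, phoneme_hits):
    """Fuse two per-clip decisions with a logical OR.

    Assumption: the source does not specify the fusion rule; OR is used
    here only to show what fusing at the decision level means.
    """
    if len(stem_hits) != len(phoneme_hits):
        raise ValueError("both decision lists must cover the same clips")
    return [s or p for s, p in zip(stem_hits, phoneme_hits)]


# Toy decisions for five clips (made up for illustration).
stem_hits = [True, False, False, True, False]
phoneme_hits = [False, False, True, True, False]

fused = fuse_or(stem_hits, phoneme_hits)
print(accuracy(stem_hits), accuracy(phoneme_hits), accuracy(fused))
```

Under an OR rule the fused accuracy can never fall below that of either individual matcher, which is one plausible reason fusion helps when the two matchers make different errors.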

Abstract

We introduce an unsupervised approach for correcting highly imperfect speech transcriptions, based on a decision-level fusion of stemming and two-way phoneme pruning. Transcripts are acquired from videos by extracting the audio with the FFmpeg framework and then converting the audio to a text transcript using the Google API. The benchmark LRW dataset has 500 word categories, with 50 videos per class in mp4 format. Each video consists of 29 frames (1.16 s in total), and the word appears in the middle of the video. Our approach aims to improve on the baseline accuracy of 9.34% by using stemming, phoneme extraction, filtering and pruning. After applying the stemming algorithm to the text transcript and evaluating the results, we achieved 23.34% accuracy in word recognition. To convert words to phonemes we used the Carnegie Mellon University (CMU) pronouncing dictionary that provides a…
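
To make the stemming step concrete, here is a minimal sketch of stem-based word matching. The abstract does not name the stemming algorithm, so `toy_stem` is a deliberately tiny suffix stripper written for illustration (it is not the Porter stemmer and not the authors' implementation), and `word_recognized` is a hypothetical helper name.

```python
import re

# Assumption: a very small suffix list, chosen only for illustration.
SUFFIXES = ("ing", "ed", "es", "s")


def toy_stem(word):
    """Strip one common suffix, keeping at least three leading characters."""
    word = word.lower()
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def word_recognized(transcript, target):
    """Return True if any transcript token shares a stem with the target word."""
    tokens = re.findall(r"[a-z']+", transcript.lower())
    target_stem = toy_stem(target)
    return any(toy_stem(token) == target_stem for token in tokens)


# An exact comparison would miss this clip: "talking" != "talked".
print(word_recognized("they were talking about it", "TALKED"))
```

Comparing stems rather than surface forms lets a transcript that contains an inflected variant of the target word still count as a correct recognition, which is consistent with the accuracy gain the abstract reports for stemming.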


Taxonomy

Methods: Pruning