Enhancing ASR Performance through OCR Word Frequency Analysis: Theoretical Foundations

Kyudan Jung; Nam-Joon Kim; Hyun Gon Ryu; Hyuk-Jae Lee

arXiv:2405.02995 · math.NA · November 12, 2024

Open Access

TL;DR

This paper explores how analyzing OCR word frequencies, grounded in power law theory, can enhance automatic speech recognition accuracy for specialized terminology, especially in lecture settings.

Contribution

It introduces a theoretical foundation based on power law for the word frequency difference method to improve ASR performance on specialized terms.

Findings

01

The power law effectively models word frequency differences.

02

The approach improves ASR accuracy for specialized terminology.

03

Experimental results support the theoretical foundation.

Abstract

As interest in large language models grows, accuracy in automatic speech recognition (ASR) has become more important. This is especially true for lectures that include specialized terminology, where the success rate of traditional ASR models tends to be low. A method based on word frequency differences has been proposed to improve ASR performance on specialized terminology. We investigated this proposal through experiments and data analysis to determine whether it effectively addresses the issue. In addition, we introduce the power law as the theoretical foundation for the relative frequency methodology used in this approach.
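The abstract does not spell out how the word frequency difference or the power-law fit is computed, so the following is only a minimal sketch under stated assumptions: relative frequency is taken as count divided by total tokens, the difference is OCR frequency minus reference-corpus frequency, and the power law is fitted as a least-squares line on log rank versus log count. All function names are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def relative_frequencies(tokens):
    """Map each word to its share of all tokens (assumed definition)."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return {word: count / total for word, count in counts.items()}


def frequency_difference(ocr_tokens, reference_tokens):
    """OCR relative frequency minus reference relative frequency, per OCR word.

    Large positive values flag words that are unusually common in the
    OCR text (e.g. lecture slides) compared with general text.
    """
    ocr = relative_frequencies(ocr_tokens)
    ref = relative_frequencies(reference_tokens)
    return {word: freq - ref.get(word, 0.0) for word, freq in ocr.items()}


def fit_power_law_exponent(tokens):
    """Estimate alpha in count ~ C * rank**(-alpha) by a log-log least-squares fit."""
    counts = sorted(Counter(tokens).values(), reverse=True)
    if len(counts) < 2:
        raise ValueError("need at least two distinct words to fit a slope")
    xs = [math.log(rank) for rank in range(1, len(counts) + 1)]
    ys = [math.log(count) for count in counts]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return -sxy / sxx  # slope is negative for a decaying law, so negate it


if __name__ == "__main__":
    ocr = "the eigenvalue of the matrix eigenvalue".split()
    reference = "the cat and the dog and the bird".split()
    diff = frequency_difference(ocr, reference)
    # Rank words by how over-represented they are in the OCR text.
    for word, delta in sorted(diff.items(), key=lambda kv: kv[1], reverse=True):
        print(f"{word}: {delta:+.3f}")
```

In this toy input, a domain term such as "eigenvalue" ranks highest because it never appears in the reference text, while a common word such as "the" scores near or below zero. A log-log least-squares fit is a simple choice for illustration; it is not claimed to be the estimator the authors used.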

Taxonomy

Topics: Simulation and Modeling Applications · Speech Recognition and Synthesis · Internet of Things and Social Network Interactions