Evolutionary optimization of contexts for phonetic correction in speech recognition systems

Rafael Viana-Cámara; Diego Campos-Sobrino; Mario Campos-Soberanis

arXiv:2102.11480·eess.AS·February 24, 2021

Open Access

TL;DR

This paper uses a genetic algorithm to generate an optimized context for a domain-specific speech recognition application and pairs it with post-processing correction based on phonetic distance metrics. The results indicate that the combination can reduce recognition errors.

Contribution

It combines genetic-algorithm context optimization with post-processing phonetic correction to improve the accuracy of general-purpose speech recognizers in domain-specific applications.

Findings

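01 A genetic algorithm is a viable tool for optimizing the context supplied to a speech recognizer for a specific application domain.

02 Post-processing correction based on phonetic distance metrics can further reduce recognition errors.

03 Combining the optimized context with phonetic post-correction reduces errors in the recognized speech.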

Abstract

Automatic Speech Recognition (ASR) is an area of growing academic and commercial interest due to the high demand for applications that use it to provide a natural communication method. It is common for general purpose ASR systems to fail in applications that use a domain-specific language. Various strategies have been used to reduce the error, such as providing a context that modifies the language model and post-processing correction methods. This article explores the use of an evolutionary process to generate an optimized context for a specific application domain, as well as different correction techniques based on phonetic distance metrics. The results show the viability of a genetic algorithm as a tool for context optimization, which, added to a post-processing correction based on phonetic representations, can reduce the errors on the recognized speech.
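
The two ideas named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the phonetic encoding, the distance threshold, the genetic operators, and every function and parameter name (`phonetic_key`, `correct`, `evolve_context`, `max_ratio`, and so on) are assumptions made for illustration. In a real setting the fitness of a context would presumably come from running the recognizer with that context and scoring its output; here the caller supplies any scoring function.

```python
import random

def levenshtein(a, b):
    """Edit distance between two strings (two-row dynamic programming)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]

def phonetic_key(word):
    """Toy phonetic encoding (an assumption, not the paper's representation):
    merge a few similar-sounding letters, drop non-initial vowels and 'h',
    then collapse repeated letters."""
    table = str.maketrans({"z": "s", "c": "k", "q": "k", "v": "b", "y": "i"})
    w = word.lower().translate(table)
    key = w[:1] + "".join(ch for ch in w[1:] if ch not in "aeiouh")
    return "".join(ch for i, ch in enumerate(key) if i == 0 or ch != key[i - 1])

def correct(words, domain_terms, max_ratio=0.34):
    """Replace a word with the phonetically closest domain term when the
    normalized distance between their keys is at most max_ratio."""
    keyed = [(t, phonetic_key(t)) for t in domain_terms]
    out = []
    for w in words:
        kw = phonetic_key(w)
        term, kt = min(keyed, key=lambda p: levenshtein(kw, p[1]))
        ratio = levenshtein(kw, kt) / max(len(kw), len(kt), 1)
        out.append(term if ratio <= max_ratio else w)
    return out

def evolve_context(candidates, fitness, pop_size=20, generations=30,
                   mut_rate=0.1, seed=0):
    """Genetic search over subsets of candidate context phrases.
    Each individual is a boolean mask; fitness scores the selected phrases.
    Needs at least 2 candidates and pop_size >= 4."""
    rng = random.Random(seed)
    n = len(candidates)

    def select(mask):
        return [c for c, keep in zip(candidates, mask) if keep]

    def score(mask):
        return fitness(select(mask))

    pop = [[rng.random() < 0.5 for _ in range(n)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=score, reverse=True)
        elite = pop[: pop_size // 2]            # truncation selection with elitism
        children = []
        while len(elite) + len(children) < pop_size:
            a, b = rng.sample(elite, 2)
            cut = rng.randrange(1, n)           # one-point crossover
            child = a[:cut] + b[cut:]
            # bit-flip mutation
            child = [(not g) if rng.random() < mut_rate else g for g in child]
            children.append(child)
        pop = elite + children
    return select(max(pop, key=score))

# Phonetic post-correction against a small, hypothetical domain vocabulary.
print(correct(["one", "pisa", "with", "peperoni"], ["pizza", "pepperoni"]))
# → ['one', 'pizza', 'with', 'pepperoni']

# Context search with a toy fitness that stands in for a recognizer-based score.
needed = {"pizza", "pepperoni"}
toy_fitness = lambda ctx: sum(1 if p in needed else -1 for p in ctx)
print(evolve_context(["pizza", "weather", "pepperoni", "stock", "train", "music"],
                     toy_fitness))
```

The normalized threshold keeps short, common words from being pulled toward domain terms. The fixed seed makes the search reproducible, which helps when comparing fitness functions.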

Taxonomy

Topics: Speech and Audio Processing · Speech Recognition and Synthesis · Advanced Data Compression Techniques