Evaluating Automatic Speech Recognition Systems for Korean Meteorological Experts
ChaeHun Park, Hojun Cho, Jaegul Choo

TL;DR
This study evaluates and improves automatic speech recognition systems tailored for Korean meteorological experts by creating a domain-specific dataset and applying data augmentation techniques.
Contribution
The paper introduces a new Korean weather domain dataset, evaluates various multilingual ASR models, and proposes a TTS-based data augmentation method to enhance recognition of specialized terminology.
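The TTS-based augmentation idea can be sketched roughly as follows. This is a minimal illustration, not the authors' pipeline: the summary does not say how the synthetic sentences were built, so the template-filling step, the function names (`placeholder_tts`, `build_augmented_pairs`), and the example terms are all assumptions. `placeholder_tts` only emits a sine tone so the sketch runs without a speech engine; a real TTS system would be plugged in through the `tts` argument.

```python
import io
import math
import struct
import wave
from typing import Callable, List, Sequence, Tuple

SAMPLE_RATE = 16_000  # assumed sample rate; not stated in the source


def placeholder_tts(text: str) -> bytes:
    """Hypothetical stand-in for a TTS engine.

    Returns raw 16-bit mono PCM containing a 220 Hz tone whose duration
    grows with the text length. It produces no speech at all.
    """
    n_samples = int(SAMPLE_RATE * 0.05 * max(len(text), 1))
    return b"".join(
        struct.pack("<h", int(8000 * math.sin(2 * math.pi * 220 * i / SAMPLE_RATE)))
        for i in range(n_samples)
    )


def pcm_to_wav_bytes(pcm: bytes) -> bytes:
    """Wrap raw 16-bit mono PCM in a WAV container, held in memory."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)
        w.setframerate(SAMPLE_RATE)
        w.writeframes(pcm)
    return buf.getvalue()


def build_augmented_pairs(
    terms: Sequence[str],
    templates: Sequence[str],
    tts: Callable[[str], bytes] = placeholder_tts,
) -> List[Tuple[str, bytes]]:
    """Fill each template with each domain term and synthesize audio.

    Returns (transcript, wav_bytes) pairs that could be mixed into
    ASR fine-tuning data.
    """
    pairs = []
    for term in terms:
        for template in templates:
            sentence = template.format(term=term)
            pairs.append((sentence, pcm_to_wav_bytes(tts(sentence))))
    return pairs


# Illustrative terms and templates only; they are not taken from the paper's dataset.
example_pairs = build_augmented_pairs(
    terms=["태풍", "호우주의보"],            # "typhoon", "heavy rain advisory"
    templates=["{term} 알려줘", "오늘 {term} 현황"],
)
```

Keeping the synthesizer behind a plain callable means the text-generation step can be tested on its own, and the TTS engine can be swapped without touching the rest of the sketch.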
Findings
Augmentation improved domain-specific term recognition
Multilingual models showed performance limitations in specialized vocabulary
Data augmentation maintained general-domain ASR performance
Abstract
This paper explores integrating Automatic Speech Recognition (ASR) into natural language query systems to improve weather forecasting efficiency for Korean meteorologists. We address challenges in developing ASR systems for the Korean weather domain, specifically specialized vocabulary and Korean linguistic intricacies. To tackle these issues, we constructed an evaluation dataset of spoken queries recorded by native Korean speakers. Using this dataset, we assessed various configurations of a multilingual ASR model family, identifying performance limitations related to domain-specific terminology. We then implemented a simple text-to-speech-based data augmentation method, which improved the recognition of specialized terms while maintaining general-domain performance. Our contributions include creating a domain-specific dataset, comprehensive ASR model evaluations, and an effective…
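The two kinds of measurement the abstract refers to, general recognition quality and recognition of specialized terms, can be sketched as follows. The abstract does not name the metrics used, so this is an assumption: character error rate (CER), a common choice for Korean, and a simple term-level recall. The function names and example strings are hypothetical.

```python
from typing import Sequence


def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance between two strings, computed row by row."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        curr = [i]
        for j, h in enumerate(hyp, 1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cer(reference: str, hypothesis: str) -> float:
    """Character error rate with spaces removed.

    Assumes NFC-normalized text, so each Hangul syllable is one character.
    """
    ref = reference.replace(" ", "")
    hyp = hypothesis.replace(" ", "")
    if not ref:
        raise ValueError("reference must contain at least one character")
    return edit_distance(ref, hyp) / len(ref)


def term_recall(
    references: Sequence[str],
    hypotheses: Sequence[str],
    terms: Sequence[str],
) -> float:
    """Fraction of domain-term occurrences in the references that also
    appear verbatim in the paired hypothesis."""
    total = 0
    hits = 0
    for ref, hyp in zip(references, hypotheses):
        for term in terms:
            if term in ref:
                total += 1
                hits += int(term in hyp)
    return hits / total if total else 0.0
```

A transcript can score a low overall CER and still miss the one term that matters in the query, which is why a term-level measure is useful alongside it.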
Taxonomy
Topics: Diverse Approaches in Healthcare and Education Studies · Marine and Coastal Research · Technology and Data Analysis
