Refining Automatic Speech Recognition System for older adults
Liu Chen, Meysam Asgari

TL;DR
This paper develops an automatic speech recognition system for seniors aged 80+ from only 12 hours of training data, using transfer learning and an attention mechanism to improve performance.
Contribution
It introduces a transfer learning approach combined with attention mechanisms specifically designed to improve ASR accuracy for older adults with minimal data.
Findings
ASR trained on the adult population performs poorly on seniors' speech; transfer learning boosts performance.
Attention mechanisms further improve recognition accuracy.
The attention-based approach achieves a 1.58% absolute improvement over the transfer learning model.
Abstract
Building a high-quality automatic speech recognition (ASR) system with limited training data has been a challenging task, particularly for a narrow target population. Open-source ASR systems, trained on sufficient data from adults, perform poorly on seniors' speech because of the acoustic mismatch between adults and seniors. With 12 hours of training data, we attempt to develop an ASR system for socially isolated seniors (80+ years old) with possible cognitive impairments. We experimentally identify that ASR for the adult population performs poorly on our target population and that transfer learning (TL) can boost the system's performance. Building on the fundamental idea of TL, namely tuning model parameters, we further improve the system by leveraging an attention mechanism to utilize the model's intermediate information. Our approach achieves a 1.58% absolute improvement over the TL model.
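The abstract does not say how the attention mechanism combines the model's intermediate information, so the following is only a minimal sketch under stated assumptions: each intermediate layer of a pretrained acoustic model yields one feature vector per frame, and a learnable query vector (hypothetical, trained on the target speakers' data) scores the layers with scaled dot-product attention. The names `attend_over_layers`, `layer_outputs`, and `query` are illustrative and do not come from the paper.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend_over_layers(layer_outputs, query):
    """Fuse per-layer feature vectors with scaled dot-product attention.

    layer_outputs: list of L vectors (each a list of D floats), one per
        intermediate layer of a hypothetical pretrained model.
    query: list of D floats; a hypothetical learnable vector that would be
        tuned on the target-population data during transfer learning.
    Returns (fused_vector, attention_weights).
    """
    dim = len(query)
    # One relevance score per layer: scaled dot product with the query.
    scores = [
        sum(q * h for q, h in zip(query, layer)) / math.sqrt(dim)
        for layer in layer_outputs
    ]
    weights = softmax(scores)
    # Weighted sum of the layer vectors, computed dimension by dimension.
    fused = [
        sum(w * layer[d] for w, layer in zip(weights, layer_outputs))
        for d in range(dim)
    ]
    return fused, weights


# Toy input: three "layers" with two-dimensional features.
layers = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
query = [0.0, 0.0]  # an all-zero query scores every layer equally
fused, weights = attend_over_layers(layers, query)
```

With an all-zero query every layer receives the same weight, so the fused vector is the plain average of the layer outputs; a trained query would instead shift weight toward the layers that are most useful for the target speakers.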
Taxonomy
Topics: Speech Recognition and Synthesis · Topic Modeling · Speech and dialogue systems
