Deep Learning-based automated classification of Chinese Speech Sound Disorders
Yao-Ming Kuo, Shanq-Jang Ruan, Yu-Chin Chen, Ya-Wen Tu

TL;DR
This paper presents a deep learning system that classifies Chinese speech sound disorders in children using acoustic features and neural networks, achieving over 74% accuracy in distinguishing four disorder types.
Contribution
It applies image-classification neural networks with data augmentation to Chinese SSD classification, using a corpus of 2540 annotated speech samples and feature maps built from three sets of MFCC parameters.
Findings
Achieved 74.4% accuracy in classifying four SSD types.
Demonstrated effectiveness of data augmentation in improving model performance.
Validated the system's ability to assist in clinical diagnosis of Chinese speech disorders.
Abstract
This article describes a system for analyzing acoustic data to assist in the diagnosis and classification of children's speech sound disorders (SSDs) using a computer. The analysis concentrated on identifying and categorizing four distinct types of Chinese SSDs. The study collected and generated a speech corpus containing 2540 stopping, backing, final consonant deletion process (FCDP), and affrication samples from 90 children aged 3–6 years with normal or pathological articulatory features. Each recording was accompanied by a detailed diagnostic annotation by two speech-language pathologists (SLPs). Classification of the speech samples was accomplished using three well-established neural network models for image classification. The feature maps were created using three sets of Mel-frequency cepstral coefficients (MFCC) parameters extracted from speech sounds and aggregated into a…
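To make the feature-map idea concrete, here is a minimal sketch of how three MFCC sets could be computed from one recording and combined into a single image-like input. The abstract is cut off before it says how the sets are aggregated, and it does not say how the three parameter sets differ, so everything specific here is an assumption: the three sets use different analysis window lengths (256, 512, and 1024 samples), they share one hop size so their time axes align, and they are stacked along a trailing channel axis. The function names `mel_filterbank`, `mfcc`, and `stacked_feature_map` are hypothetical, and the MFCC routine is a plain NumPy approximation rather than the authors' pipeline.

```python
import numpy as np

def mel_filterbank(n_mels, n_fft, sr):
    """Triangular filters spaced evenly on the mel scale (HTK-style formula)."""
    hz_to_mel = lambda f: 2595.0 * np.log10(1.0 + f / 700.0)
    mel_to_hz = lambda m: 700.0 * (10.0 ** (m / 2595.0) - 1.0)
    mel_points = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_mels + 2)
    hz_points = mel_to_hz(mel_points)
    fft_freqs = np.linspace(0.0, sr / 2.0, n_fft // 2 + 1)
    fb = np.zeros((n_mels, n_fft // 2 + 1))
    for i in range(n_mels):
        lo, mid, hi = hz_points[i], hz_points[i + 1], hz_points[i + 2]
        rising = (fft_freqs - lo) / (mid - lo)
        falling = (hi - fft_freqs) / (hi - mid)
        fb[i] = np.maximum(0.0, np.minimum(rising, falling))
    return fb

def mfcc(signal, sr, win_len, hop, n_fft=1024, n_mels=40, n_mfcc=13):
    """Return an (n_mfcc, n_frames) MFCC matrix for one window length."""
    # Center-pad so every window length yields the same number of frames.
    padded = np.pad(signal, win_len // 2, mode="reflect")
    n_frames = 1 + (len(padded) - win_len) // hop
    idx = np.arange(win_len)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = padded[idx] * np.hanning(win_len)
    # Zero-pad each frame to n_fft so all sets share one filterbank.
    power = np.abs(np.fft.rfft(frames, n=n_fft, axis=1)) ** 2
    log_mel = np.log(power @ mel_filterbank(n_mels, n_fft, sr).T + 1e-10)
    # Unnormalized DCT-II over the mel axis, keeping the first n_mfcc rows.
    k = np.arange(n_mfcc)[:, None]
    n = np.arange(n_mels)[None, :]
    dct = np.cos(np.pi * k * (2 * n + 1) / (2 * n_mels))
    return dct @ log_mel.T

def stacked_feature_map(signal, sr, win_lens=(256, 512, 1024), hop=128):
    """Assumed aggregation: one MFCC matrix per window length, stacked as channels."""
    maps = [mfcc(signal, sr, w, hop) for w in win_lens]
    return np.stack(maps, axis=-1)
```

Under these assumptions, a one-second clip sampled at 16 kHz gives an array of shape (13, 126, 3), which has the same layout as a small three-channel image and can be fed to an image-classification network after resizing.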
