Toward Real-World Voice Disorder Classification

Heng-Cheng Kuo, Yu-Peng Hsieh, Huan-Hsin Tseng, Chi-Te Wang, Shih-Hau Fang, and Yu Tsao

arXiv:2112.02538·eess.AS·April 27, 2023

Open Access

TL;DR

This paper presents a compact voice disorder classification system that uses domain adversarial training to stay robust in noisy real-world environments. It improves unweighted average recall by 13% under noise while cutting memory and computation by over 73.9%.

Contribution

The paper introduces a system that combines factorized CNNs with domain adversarial training to address both the domain mismatch between clinical and real-world recordings and the resource constraints of home deployment.
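To make the resource argument concrete, the sketch below compares the weight count of a standard 2-D convolution with that of one common factorization, the depthwise separable convolution. This page does not say which factorization the authors use, so the choice of depthwise separable layers, the function names, and the layer sizes are all assumptions made for illustration. The resulting ratio is not the paper's reported figure.

```python
def standard_conv_params(kernel, c_in, c_out):
    """Weights in a standard kernel x kernel convolution (bias ignored)."""
    return kernel * kernel * c_in * c_out


def depthwise_separable_params(kernel, c_in, c_out):
    """Weights in a depthwise separable convolution (bias ignored).

    Assumed factorization: one kernel x kernel filter per input channel,
    followed by a 1x1 pointwise convolution that mixes channels.
    """
    depthwise = kernel * kernel * c_in
    pointwise = c_in * c_out
    return depthwise + pointwise


# Hypothetical layer sizes, chosen only for illustration.
kernel, c_in, c_out = 3, 64, 128
full = standard_conv_params(kernel, c_in, c_out)
factored = depthwise_separable_params(kernel, c_in, c_out)
reduction = 1 - factored / full

print(f"standard:  {full} weights")
print(f"factored:  {factored} weights")
print(f"reduction: {reduction:.1%}")
```

The saving grows with the number of output channels, because the expensive spatial filtering is no longer repeated for every input and output channel pair.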

Findings

01. 13% improvement in unweighted average recall in noisy environments

02. 80% recall maintained in clinical domain with slight degradation

03. Reduced memory and computation by over 73.9%
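Unweighted average recall (UAR), the metric in the first finding, is the mean of the per-class recalls, so every class counts equally regardless of how many samples it has. The sketch below shows one way to compute it. The function name and the toy labels are made up for illustration and are not data from the paper.

```python
from collections import defaultdict


def unweighted_average_recall(y_true, y_pred):
    """Mean of per-class recall, with each class weighted equally."""
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and equal in length")
    correct = defaultdict(int)
    total = defaultdict(int)
    for truth, pred in zip(y_true, y_pred):
        total[truth] += 1
        if truth == pred:
            correct[truth] += 1
    recalls = [correct[label] / total[label] for label in total]
    return sum(recalls) / len(recalls)


# Toy, imbalanced example (hypothetical labels, not results from the paper).
y_true = ["healthy"] * 4 + ["neoplasm"] * 2 + ["benign"] * 2
y_pred = ["healthy"] * 4 + ["neoplasm", "healthy"] + ["healthy", "healthy"]

uar = unweighted_average_recall(y_true, y_pred)
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(f"UAR: {uar:.3f}, accuracy: {accuracy:.3f}")
```

In this toy case the per-class recalls are 1.0, 0.5, and 0.0, so UAR is 0.5 while plain accuracy is 0.625. The gap shows why UAR is the more informative metric when one class dominates the data.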

Abstract

Objective: Voice disorders significantly compromise individuals' ability to speak in their daily lives. Without early diagnosis and treatment, these disorders may deteriorate drastically. Thus, automatic classification systems for home use are desirable for people who lack access to clinical disease assessments. However, the performance of such systems may be weakened by constrained resources and by the domain mismatch between clinical data and noisy real-world data. Methods: This study develops a compact and domain-robust voice disorder classification system that classifies utterances as healthy, neoplasm, or benign structural disease. The proposed system uses a feature extractor composed of factorized convolutional neural networks and then applies domain adversarial training to reconcile the domain mismatch by extracting domain-invariant features. Results: The…
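Domain adversarial training is commonly implemented with gradient reversal: the domain classifier learns to tell domains apart, while the feature extractor receives the negated domain gradient and so learns features that hide the domain. The sketch below shows that update rule on a deliberately tiny scalar model with hand-written gradients. The scalar parameters, squared-error losses, the function name `dann_step`, and the values of `lam` and `lr` are assumptions for illustration only; the abstract does not specify the authors' architecture or losses.

```python
def dann_step(w, a, b, x, y, d, lam=1.0, lr=0.01):
    """One gradient step of a toy domain adversarial model.

    w: feature extractor weight, feature f = w * x
    a: label predictor weight, prediction a * f against target y
    b: domain predictor weight, prediction b * f against domain d
    lam: strength of the reversed domain gradient on the extractor
    """
    f = w * x
    err_y = a * f - y  # task residual
    err_d = b * f - d  # domain residual

    # Each head minimizes its own squared error as usual.
    grad_a = 2 * err_y * f
    grad_b = 2 * err_d * f

    # The extractor descends the task loss but ascends the domain loss:
    # the domain gradient enters with a flipped sign, scaled by lam.
    grad_w_task = 2 * err_y * a * x
    grad_w_domain = 2 * err_d * b * x
    grad_w = grad_w_task - lam * grad_w_domain

    return w - lr * grad_w, a - lr * grad_a, b - lr * grad_b


# With lam=0 the extractor ignores the domain head entirely.
plain = dann_step(1.0, 1.0, 1.0, x=1.0, y=0.0, d=0.0, lam=0.0)
# With lam=1 the reversed domain gradient cancels the task gradient here.
adversarial = dann_step(1.0, 1.0, 1.0, x=1.0, y=0.0, d=0.0, lam=1.0)
print("lam=0:", plain)
print("lam=1:", adversarial)
```

In a real system the same sign flip is usually applied inside an automatic differentiation framework, so the three components can be trained together with a single backward pass.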

Taxonomy

Topics: Voice and Speech Disorders · Speech Recognition and Synthesis