Voice Pathology Detection Using Phonation

Sri Raksha Siva; Nived Suthahar; Prakash Boominathan; Uma Ranjan

arXiv:2508.07587 · cs.CV · August 12, 2025

Open Access

TL;DR

This paper presents a noninvasive, machine learning-based framework utilizing phonation data and acoustic features to accurately detect voice pathologies, aiming to improve early diagnosis and patient outcomes.

Contribution

It introduces a novel combination of acoustic features, RNN models with attention, and data augmentation techniques for voice pathology detection.

Findings

01

High classification accuracy achieved with RNN models.

02

Data augmentation improves model robustness.

03

Scale-based features enhance detection of irregularities.
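As a rough illustration of what a scale-based feature measures, the sketch below estimates a Hurst exponent from how the spread of increments grows with lag, i.e. std(x[t+lag] - x[t]) ~ lag^H. This estimator, the function name hurst_exponent, and the max_lag parameter are assumptions made for illustration; the paper's own estimation procedure is not given on this page.

```python
import math
import random


def hurst_exponent(series, max_lag=50):
    """Estimate H as the slope of log(std of lagged differences) vs log(lag).

    Illustrative estimator only; not necessarily the one used in the paper.
    """
    if len(series) <= max_lag + 1:
        raise ValueError("series is too short for the requested max_lag")
    log_lags, log_stds = [], []
    for lag in range(2, max_lag):
        diffs = [series[i + lag] - series[i] for i in range(len(series) - lag)]
        mean = sum(diffs) / len(diffs)
        std = math.sqrt(sum((d - mean) ** 2 for d in diffs) / len(diffs))
        if std > 0:
            log_lags.append(math.log(lag))
            log_stds.append(math.log(std))
    if len(log_lags) < 2:
        raise ValueError("not enough non-constant lags to fit a slope")
    # Ordinary least-squares slope of log_stds against log_lags.
    mx = sum(log_lags) / len(log_lags)
    my = sum(log_stds) / len(log_stds)
    num = sum((x - mx) * (y - my) for x, y in zip(log_lags, log_stds))
    den = sum((x - mx) ** 2 for x in log_lags)
    return num / den


if __name__ == "__main__":
    rng = random.Random(0)
    # A random walk (cumulative sum of white noise) is the textbook H = 0.5 case.
    walk = [0.0]
    for _ in range(5000):
        walk.append(walk[-1] + rng.gauss(0.0, 1.0))
    print("random walk H estimate:", round(hurst_exponent(walk), 3))
```

With this estimator, values near 0.5 indicate random-walk-like behaviour, while values well above or below 0.5 indicate more persistent or more irregular signals respectively.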

Abstract

Voice disorders significantly affect communication and quality of life, requiring an early and accurate diagnosis. Traditional methods like laryngoscopy are invasive, subjective, and often inaccessible. This research proposes a noninvasive, machine learning-based framework for detecting voice pathologies using phonation data. Phonation data from the Saarbrücken Voice Database are analyzed using acoustic features such as Mel Frequency Cepstral Coefficients (MFCCs), chroma features, and Mel spectrograms. Recurrent Neural Networks (RNNs), including LSTM and attention mechanisms, classify samples into normal and pathological categories. Data augmentation techniques, including pitch shifting and Gaussian noise addition, enhance model generalizability, while preprocessing ensures signal quality. Scale-based features, such as Hölder and Hurst exponents, further capture signal…
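The Gaussian noise augmentation mentioned in the abstract can be sketched as follows. This is a minimal illustration that assumes the noise level is set by a target signal-to-noise ratio; the function name add_gaussian_noise, the snr_db parameter, and the 20 dB default are hypothetical, and the abstract does not state which noise levels the authors used.

```python
import math
import random


def add_gaussian_noise(signal, snr_db=20.0, rng=None):
    """Return a copy of `signal` with zero-mean Gaussian noise added.

    The noise variance is chosen so that signal power / noise power matches
    the requested SNR in decibels. Parameter values are illustrative only.
    """
    if not signal:
        raise ValueError("signal must not be empty")
    rng = rng or random.Random()
    signal_power = sum(s * s for s in signal) / len(signal)
    noise_power = signal_power / (10.0 ** (snr_db / 10.0))
    sigma = math.sqrt(noise_power)
    return [s + rng.gauss(0.0, sigma) for s in signal]


if __name__ == "__main__":
    # Stand-in for a sustained vowel: one second of a 150 Hz tone at 16 kHz.
    sample_rate = 16000
    tone = [math.sin(2.0 * math.pi * 150.0 * n / sample_rate)
            for n in range(sample_rate)]
    noisy = add_gaussian_noise(tone, snr_db=20.0, rng=random.Random(0))
    print(len(noisy), "samples in the augmented copy")
```

Each call with a different seed yields a new noisy copy of the same recording, which is how this kind of augmentation enlarges a training set without new data collection.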

Taxonomy

Topics: Voice and Speech Disorders · Respiratory and Cough-Related Research · Phonocardiography and Auscultation Techniques