Reproducible Machine Learning-based Voice Pathology Detection: Introducing the Pitch Difference Feature
Jan Vrba, Jakub Steinbach, Tomáš Jirsa, Laura Verde, Roberta De Fazio, Yuwen Zeng, Kei Ichiji, Lukáš Hájek, Zuzana Sedláková, Zuzana Urbániová, Martin Chovanec, Jan Mareš, Noriyasu Homma

TL;DR
This paper presents a reproducible machine learning methodology for voice pathology detection using a novel pitch difference feature and publicly available data, achieving high recall rates across genders.
Contribution
It introduces the pitch difference and NaN features, along with a comprehensive evaluation framework, enhancing reproducibility and effectiveness in voice pathology detection.
Findings
Achieved approximately 85.6% unweighted average recall (UAR)
Validated the effectiveness of novel features in pathology detection
Provided a publicly available code repository for reproducibility
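Unweighted average recall is the mean of the per-class recalls, so the minority class counts as much as the majority class. The minimal sketch below shows the computation on toy labels that are invented for illustration; the function name and the data are assumptions and are not taken from the paper's repository.

```python
from collections import defaultdict

def unweighted_average_recall(y_true, y_pred):
    """Mean of per-class recalls; each class counts equally regardless of its size."""
    hits = defaultdict(int)    # correctly predicted samples per true class
    totals = defaultdict(int)  # number of samples per true class
    for truth, pred in zip(y_true, y_pred):
        totals[truth] += 1
        if truth == pred:
            hits[truth] += 1
    recalls = [hits[c] / totals[c] for c in totals]
    return sum(recalls) / len(recalls)

# Toy, imbalanced labels (invented): 1 = pathological, 0 = healthy.
y_true = [1, 1, 1, 1, 1, 1, 0, 0]
y_pred = [1, 1, 1, 1, 1, 0, 1, 0]

uar = unweighted_average_recall(y_true, y_pred)
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(f"UAR = {uar:.4f}, accuracy = {accuracy:.4f}")
```

On this toy data the plain accuracy is higher than the UAR, because the larger class dominates accuracy while UAR penalizes the poor recall on the smaller class.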
Abstract
Purpose: We introduce a novel methodology for voice pathology detection using the publicly available Saarbrücken Voice Database (SVD) and a robust feature set combining commonly used acoustic handcrafted features with two novel ones: pitch difference (relative variation in fundamental frequency) and NaN feature (failed fundamental frequency estimation). Methods: We evaluate six machine learning (ML) algorithms (support vector machine, k-nearest neighbors, naive Bayes, decision tree, random forest, and AdaBoost) using grid search over feasible hyperparameters and 20480 different feature subsets. For each ML algorithm, the top 1000 combinations of classification model and feature subset are validated with repeated stratified cross-validation. To address class imbalance, we apply K-Means SMOTE to augment the training data. Results: Our approach achieves 85.61%, 84.69% and 85.22%…
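The abstract describes the two novel features only in words. The sketch below is one possible reading, under stated assumptions: pitch difference is taken as (max F0 - min F0) / min F0 over the frames where estimation succeeded, and the NaN feature is taken as a binary flag that is 1 when any frame failed. Both formulas and the name `pitch_features` are hypothetical illustrations; the paper's exact definitions may differ.

```python
import math

def pitch_features(f0_track):
    """Return (pitch_difference, nan_flag) for a per-frame F0 track in Hz.

    Assumptions (not confirmed by the abstract):
      - failed F0 estimates are stored as float('nan')
      - pitch difference = (max F0 - min F0) / min F0 over valid frames
      - NaN feature = 1 if any frame failed, else 0
    """
    valid = [f for f in f0_track if not math.isnan(f)]
    nan_flag = 1 if len(valid) < len(f0_track) else 0
    if not valid:
        # No frame had a usable F0 estimate, so the ratio is undefined.
        return float("nan"), nan_flag
    lowest, highest = min(valid), max(valid)
    return (highest - lowest) / lowest, nan_flag

# Invented F0 track with one failed frame.
pitch_diff, nan_flag = pitch_features([200.0, 210.0, float("nan"), 190.0])
print(pitch_diff, nan_flag)
```

A relative measure like this is scale-free, so a 20 Hz spread counts for more in a low-pitched voice than in a high-pitched one.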
Taxonomy
Topics: Speech Recognition and Synthesis
Methods: Synthetic Minority Over-sampling Technique · Support Vector Machine
