Ensemble of classifiers for speech evaluation

G. Belokrylov; A. Korenev; B. Lodonova; A. Novokhrestov

arXiv:2501.00067·cs.SD·January 3, 2025

Open Access

TL;DR

This paper explores an ensemble of binary classifiers for evaluating speech quality in medical applications, using seven quantitative metrics as features and expert assessments of pronunciation quality as class labels.

Contribution

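It introduces an ensemble approach that combines five classifiers (logistic regression, support vector machine, naive Bayes, decision trees, and K-nearest neighbors) for speech assessment, and reports slight accuracy improvements over the individual models.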
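As a rough illustration of how five binary classifiers might be combined, the sketch below uses hard majority voting. This summary does not say which combination rule the paper uses, so the voting rule, the function name `majority_vote`, and the tie-breaking choice are all assumptions made for illustration.

```python
from typing import Sequence


def majority_vote(predictions: Sequence[int]) -> int:
    """Combine binary predictions (0 or 1) from several classifiers.

    Hypothetical helper: hard majority voting is an assumed rule,
    not necessarily the combination method used in the paper.
    """
    if not predictions:
        raise ValueError("need at least one prediction")
    if any(p not in (0, 1) for p in predictions):
        raise ValueError("predictions must be 0 or 1")
    ones = sum(predictions)
    # Class 1 (high-quality speech) wins only with a strict majority.
    # With five voters a tie cannot occur; for even-sized ensembles a
    # tie falls back to class 0 (distorted), an arbitrary choice here.
    return 1 if 2 * ones > len(predictions) else 0


# Example: outputs of LR, SVM, NB, DT, KNN for one syllable (made-up values)
votes = [1, 1, 0, 1, 0]
label = majority_vote(votes)
```

With an odd number of voters and two classes, the result is always well defined, which is one practical reason to use five base classifiers.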

Findings

01

Ensemble method slightly outperforms individual classifiers.

02

Multiple distance-based metrics are effective features.

03

Support vector machine and ensemble methods show promising results.

Abstract

The article describes an attempt to apply an ensemble of binary classifiers to the problem of speech assessment in medicine. A dataset was compiled from quantitative and expert assessments of syllable pronunciation quality. Values of 7 selected metrics were used as features: dynamic time warping (DTW) distance, Minkowski distance, correlation coefficient, longest common subsequence (LCSS), edit distance on real sequence (EDR), edit distance with real penalty (ERP), and move-split-merge (MSM). Expert assessment of pronunciation quality was used as the class label: class 1 means high-quality speech, class 0 means distorted speech. Training results were compared for five classification methods: logistic regression (LR), support vector machine (SVM), naive Bayes (NB), decision trees (DT), and K-nearest neighbors (KNN). The results of using the mixture method to…
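To make the feature construction more concrete, here is a minimal sketch of two of the listed metrics, DTW distance and Minkowski distance, computed between two one-dimensional sequences. The function names, the absolute-difference local cost, and the toy sequences are assumptions for illustration; the abstract does not specify how the paper computes these metrics or what signal representation it compares.

```python
import math
from typing import Sequence


def dtw_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Classic dynamic time warping distance with |x - y| as local cost."""
    n, m = len(a), len(b)
    if n == 0 or m == 0:
        raise ValueError("sequences must be non-empty")
    # cost[i][j] = minimal accumulated cost of aligning a[:i] with b[:j]
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # a[i-1] also matched an earlier b element
                cost[i][j - 1],      # b[j-1] also matched an earlier a element
                cost[i - 1][j - 1],  # both sequences advance together
            )
    return cost[n][m]


def minkowski_distance(a: Sequence[float], b: Sequence[float], p: float = 2.0) -> float:
    """Minkowski distance of order p between equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1.0 / p)


# Toy reference and test sequences (made-up values)
reference = [1.0, 2.0, 3.0]
stretched = [1.0, 2.0, 2.0, 3.0]
features = [dtw_distance(reference, stretched)]
```

DTW tolerates differences in timing (the stretched sequence above aligns with zero cost), whereas Minkowski distance requires equal-length inputs and compares them point by point.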

Taxonomy

Topics: Speech and Audio Processing · Speech Recognition and Synthesis

Methods: Logistic Regression