Multiview Canonical Correlation Analysis for Automatic Pathological Speech Detection

Yacouba Kaloga; Shakeel A. Sheikh; Ina Kodrasi

arXiv:2409.17276·eess.AS·September 27, 2024

Open Access

TL;DR

This paper introduces a multiview canonical correlation analysis method to improve automatic pathological speech detection by reducing irrelevant uncorrelated information in input representations, leading to better performance and interpretability.

Contribution

The paper proposes using Multiview Canonical Correlation Analysis (MCCA) to enhance pathological speech detection by filtering out uncorrelated information, outperforming other dimensionality reduction techniques.

Findings

1. MCCA significantly improves detection performance.
2. MCCA preserves interpretability of input representations.
3. Traditional classifiers with MCCA match or exceed complex models.

Abstract

Recently proposed automatic pathological speech detection approaches rely on spectrogram input representations or wav2vec2 embeddings. These representations may contain pathology-irrelevant, uncorrelated information, such as changing phonetic content or variations in speaking style across time, which can adversely affect classification performance. To address this issue, we propose to use Multiview Canonical Correlation Analysis (MCCA) on these input representations prior to automatic pathological speech detection. Our results demonstrate that, unlike other dimensionality reduction techniques, the use of MCCA leads to a considerable improvement in pathological speech detection performance by eliminating uncorrelated information present in the input representations. Employing MCCA with traditional classifiers yields comparable or higher performance than using sophisticated architectures,…
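To make the idea of keeping only the information that is correlated across views concrete, here is a minimal NumPy sketch of one common MCCA formulation: whiten each view, concatenate the whitened views, and keep the leading eigenvectors. This is a generic textbook variant and not necessarily the formulation used in the paper. The function name `mcca_fit`, the synthetic data, and the way views are built are assumptions for illustration; this summary does not specify how views are formed from spectrograms or wav2vec2 embeddings.

```python
import numpy as np


def mcca_fit(views, n_components, eps=1e-8):
    """Generic MCCA sketch (hypothetical helper, not the paper's code).

    views: list of arrays, each of shape (n_samples, d_k).
    Returns per-view means, per-view projection matrices, and the
    leading eigenvalues (each lies between 0 and the number of views).
    """
    means = [v.mean(axis=0) for v in views]
    whiteners, whitened = [], []
    for v, m in zip(views, means):
        # SVD of the centered view; dropping tiny singular values
        # avoids dividing by (near) zero for rank-deficient views.
        _, s, vt = np.linalg.svd(v - m, full_matrices=False)
        keep = s > eps * s.max()
        w = vt[keep].T / s[keep]          # (d_k, r_k) whitening matrix
        whiteners.append(w)
        whitened.append((v - m) @ w)      # columns are orthonormal

    # Directions shared by several views get large eigenvalues here.
    z = np.hstack(whitened)
    evals, evecs = np.linalg.eigh(z.T @ z)
    order = np.argsort(evals)[::-1][:n_components]
    evals, evecs = evals[order], evecs[:, order]

    # Split the eigenvectors back into one projection matrix per view.
    projections, start = [], 0
    for w in whiteners:
        r = w.shape[1]
        projections.append(w @ evecs[start:start + r])
        start += r
    return means, projections, evals


# Synthetic demo: three views share one latent signal, everything else
# is view-specific noise (assumed data, not from the paper).
rng = np.random.default_rng(0)
n = 500
shared = rng.standard_normal((n, 1))
views = []
for _ in range(3):
    own = rng.standard_normal((n, 4))
    raw = np.hstack([shared + 0.05 * rng.standard_normal((n, 1)), own])
    views.append(raw @ rng.standard_normal((5, 5)))  # random mixing

means, projections, evals = mcca_fit(views, n_components=2)
reduced = [(v - m) @ p for v, m, p in zip(views, means, projections)]
```

In this sketch the first eigenvalue approaches the number of views when a component is present in all of them, while view-specific noise yields eigenvalues near 1. The reduced features in `reduced` could then be passed to any conventional classifier, in the spirit of the abstract's remark about traditional classifiers.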

Taxonomy

Topics: Speech Recognition and Synthesis · Speech and Audio Processing