Automatic dysarthric speech detection exploiting pairwise distance-based convolutional neural networks
P. Janbakhshi, I. Kodrasi, H. Bourlard

TL;DR
This paper introduces a novel CNN-based method for dysarthric speech detection that analyzes pairwise distance matrices computed between the articulatory-posterior representations of a healthy reference speaker and a test speaker, reporting high accuracy across multiple languages and pathologies.
Contribution
It proposes a new approach using pairwise distance matrices and CNNs for dysarthric speech detection, optimized end-to-end, outperforming existing CNN baselines.
Findings
High detection accuracy on multiple language databases
Outperforms other CNN-based methods
Effective across different speech pathologies
Abstract
Automatic dysarthric speech detection can provide reliable and cost-effective computer-aided tools to assist the clinical diagnosis and management of dysarthria. In this paper we propose a novel automatic dysarthric speech detection approach based on analyses of pairwise distance matrices using convolutional neural networks (CNNs). We represent utterances through articulatory posteriors and consider pairs of phonetically-balanced representations, with one representation from a healthy speaker (i.e., the reference representation) and the other representation from the test speaker (i.e., test representation). Given such pairs of reference and test representations, features are first extracted using a feature extraction front-end, a frame-level distance matrix is computed, and the obtained distance matrix is considered as an image by a CNN-based binary classifier. The feature extraction,…
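The frame-level distance matrix described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the Euclidean metric, the function name `pairwise_distance_matrix`, and the random placeholder features are all assumptions made for illustration, since the abstract does not specify the distance or the feature front-end.

```python
import numpy as np


def pairwise_distance_matrix(ref, test):
    """Frame-level distances between a reference and a test representation.

    ref:  array of shape (T_ref, D), one feature vector per frame.
    test: array of shape (T_test, D).
    Returns an array of shape (T_ref, T_test).
    Assumption: Euclidean distance; the source does not name the metric.
    """
    ref = np.asarray(ref, dtype=float)
    test = np.asarray(test, dtype=float)
    if ref.ndim != 2 or test.ndim != 2 or ref.shape[1] != test.shape[1]:
        raise ValueError("expected 2-D inputs with the same feature dimension")
    # Broadcast to (T_ref, T_test, D), then reduce over the feature axis.
    diff = ref[:, None, :] - test[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


# Placeholder features standing in for the extracted representations.
rng = np.random.default_rng(0)
ref_feats = rng.random((5, 3))   # hypothetical healthy reference: 5 frames
test_feats = rng.random((7, 3))  # hypothetical test speaker: 7 frames

dist = pairwise_distance_matrix(ref_feats, test_feats)
# Add a channel axis so the matrix can be passed to a CNN as a one-channel image.
dist_image = dist[None, :, :]
```

Entry `dist[i, j]` compares frame `i` of the reference with frame `j` of the test utterance, so the matrix can be treated as a single-channel image whose structure a CNN-based binary classifier can learn to interpret.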
Taxonomy
Topics: Voice and Speech Disorders · Speech Recognition and Synthesis · Dysphagia Assessment and Management
