Supervised Speech Representation Learning for Parkinson's Disease Classification
Parvaneh Janbakhshi, Ina Kodrasi

TL;DR
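This paper introduces supervised auto-encoders for Parkinson's disease detection from speech. Adversarial training against a speaker identification task makes the learned representations more robust to speaker identity, and joint training with a classification task makes them more discriminative. The resulting supervised representations outperform unsupervised baselines.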
Contribution
It presents novel supervised auto-encoder methods with adversarial and joint training strategies for more effective Parkinson's speech classification.
Findings
Supervised representations outperform unsupervised baselines.
Adversarial training reduces the influence of speaker variability.
Joint training enhances discriminative power.
Abstract
Recently proposed automatic pathological speech classification techniques use unsupervised auto-encoders to obtain a high-level abstract representation of speech. Since these representations are learned based on reconstructing the input, there is no guarantee that they are robust to pathology-unrelated cues such as speaker identity information. Further, these representations are not necessarily discriminative for pathology detection. In this paper, we exploit supervised auto-encoders to extract robust and discriminative speech representations for Parkinson's disease classification. To reduce the influence of speaker variabilities unrelated to pathology, we propose to obtain speaker identity-invariant representations by adversarial training of an auto-encoder and a speaker identification task. To obtain a discriminative representation, we propose to jointly train an auto-encoder and a…
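The two training strategies named above can be read as two ways of combining a reconstruction loss with a classification loss. The sketch below is only an illustration under stated assumptions, not the authors' implementation: the function names, the weight `lam`, the mean-squared-error reconstruction loss, and the single-objective form of the adversarial term are all hypothetical choices made here, since the summary does not specify them.

```python
import math


def mse(x, x_hat):
    """Mean squared reconstruction error between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(x, x_hat)) / len(x)


def cross_entropy(probs, label):
    """Negative log-probability that a classifier assigns to the true class."""
    return -math.log(probs[label])


def adversarial_objective(x, x_hat, speaker_probs, speaker_id, lam=0.5):
    # Assumed form: the auto-encoder minimises reconstruction error while
    # maximising the speaker-identification loss, hence the minus sign.
    # `lam` is a hypothetical trade-off weight.
    return mse(x, x_hat) - lam * cross_entropy(speaker_probs, speaker_id)


def joint_objective(x, x_hat, class_probs, class_label, lam=0.5):
    # Assumed form: reconstruction and classification losses are minimised
    # together, so the classification loss is added.
    return mse(x, x_hat) + lam * cross_entropy(class_probs, class_label)


if __name__ == "__main__":
    x, x_hat = [1.0, 0.0], [0.5, 0.0]
    # A speaker classifier at chance level is what the encoder wants ...
    print(adversarial_objective(x, x_hat, [0.5, 0.5], 0))
    # ... while a confident speaker classifier raises the encoder's objective.
    print(adversarial_objective(x, x_hat, [0.9, 0.1], 0))
    print(joint_objective(x, x_hat, [0.5, 0.5], 0))
```

In this reading, the only difference between the two strategies is the sign of the auxiliary term: the adversarial variant rewards representations from which the speaker cannot be identified, while the joint variant rewards representations from which the target class can be predicted.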
Taxonomy
Topics: Voice and Speech Disorders · Speech Recognition and Synthesis · Music and Audio Processing
