Unveiling Interpretability in Self-Supervised Speech Representations for Parkinson's Diagnosis

David Gimeno-Gómez; Catarina Botelho; Anna Pompili; Alberto Abad; Carlos-D. Martínez-Hinarejos

arXiv:2412.02006 · cs.CV · February 11, 2025

TL;DR

This paper introduces an interpretable framework for Parkinson's diagnosis using self-supervised speech representations, enhancing transparency and clinical trustworthiness while maintaining competitive accuracy across diverse speech benchmarks.

Contribution

The paper presents a novel cross-attention based interpretability framework tailored for self-supervised speech embeddings in Parkinson's diagnosis, addressing the black-box challenge.

Findings

01. Effective identification of meaningful speech patterns in embeddings.

02. Enhanced interpretability through temporal and embedding analysis.

03. Competitive classification accuracy with robustness in cross-lingual scenarios.

Abstract

Recent works in pathological speech analysis have increasingly relied on powerful self-supervised speech representations, leading to promising results. However, the complex, black-box nature of these embeddings and the limited research on their interpretability significantly restrict their adoption for clinical diagnosis. To address this gap, we propose a novel, interpretable framework specifically designed to support Parkinson's Disease (PD) diagnosis. Through the design of simple yet effective cross-attention mechanisms for both embedding- and temporal-level analysis, the proposed framework offers interpretability from two distinct but complementary perspectives. Experimental findings across five well-established speech benchmarks for PD detection demonstrate the framework's capability to identify meaningful speech patterns within self-supervised representations for a wide range of…
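
To make the idea of temporal-level cross-attention more concrete, here is a minimal sketch in plain Python. It is not the authors' architecture: the single-head scaled dot-product form, the function names (`softmax`, `cross_attention`), and the random toy data are all assumptions chosen for illustration. The point it shows is only that the attention weights form a distribution over frames, which can then be inspected to see which parts of an utterance a query attends to.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(query, keys, values):
    """Single-head scaled dot-product cross-attention for one query vector.

    query:  list of d floats (hypothetical: a projected reference feature)
    keys:   T lists of d floats (hypothetical: projected frame embeddings)
    values: T lists of d_v floats
    Returns (context, weights); weights holds one entry per frame.
    """
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    weights = softmax(scores)
    dim_v = len(values[0])
    context = [sum(w * v[j] for w, v in zip(weights, values)) for j in range(dim_v)]
    return context, weights


# Toy data: 6 frames of 4-dimensional vectors (random, not real speech).
rng = random.Random(0)
frames = [[rng.uniform(-1.0, 1.0) for _ in range(4)] for _ in range(6)]
query = [rng.uniform(-1.0, 1.0) for _ in range(4)]

context, weights = cross_attention(query, frames, frames)
top_frame = max(range(len(weights)), key=weights.__getitem__)
print("temporal attention weights:", [round(w, 3) for w in weights])
print("most attended frame index:", top_frame)
```

Under the same assumptions, calling `cross_attention` on the transposed frame matrix would yield weights over embedding dimensions instead of over time, which is one plausible reading of the embedding-level perspective the abstract mentions.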

Code & Models

Repositories

david-gimeno/interpreting-ssl-parkinson-speech
Official
