WavRx: a Disease-Agnostic, Generalizable, and Privacy-Preserving Speech Health Diagnostic Model
Yi Zhu, Tiago Falk

TL;DR
WavRx is a novel speech-based health diagnostic model that achieves state-of-the-art accuracy across diseases, enhances generalizability, and reduces speaker identity leakage for privacy preservation.
Contribution
It introduces a disease-agnostic, generalizable, and privacy-preserving speech health diagnostic model using universal speech representations.
Findings
State-of-the-art performance on six pathological speech datasets
Significantly reduced speaker identity leakage in health embeddings
Enhanced generalizability across datasets and conditions
Abstract
Speech is known to carry health-related attributes and has emerged as a novel avenue for remote and long-term health monitoring. However, existing models are usually tailored to a specific type of disease and have been shown to lack generalizability across datasets. Furthermore, concerns have recently been raised about the leakage of speaker identity from health embeddings. To mitigate these limitations, we propose WavRx, a speech health diagnostics model that captures respiration- and articulation-related dynamics from a universal speech representation. Our in-domain and cross-domain experiments on six pathological speech datasets demonstrate that WavRx is a new state-of-the-art health diagnostic model. Furthermore, we show that the amount of speaker identity information contained in the WavRx health embeddings is significantly reduced without extra guidance during training. An in-depth…
Taxonomy
Topics: Voice and Speech Disorders
