Unified Pathological Speech Analysis with Prompt Tuning

Fei Yang; Xuenan Xu; Mengyue Wu; Kai Yu

arXiv:2411.04142·eess.AS·November 8, 2024

TL;DR

This paper introduces a unified pathological speech analysis system that uses prompt tuning to detect Alzheimer's disease, depression, and Parkinson's disease from speech, sharing knowledge across tasks to improve performance and training efficiency.

Contribution

The paper presents a unified prompt-tuning approach to multi-disease speech analysis that improves efficiency and performance over disease-specific models.

Findings

01. Strong performance across Alzheimer's, depression, and Parkinson's detection.

02. Faster convergence and higher F1 scores with shared knowledge.

03. Efficient training by fine-tuning only a small part of the model.

Abstract

Pathological speech analysis has attracted considerable research interest for detecting diseases such as depression and Alzheimer's disease. However, previous pathological speech analysis models are commonly designed for a single disease and overlook the connections between diseases, which may constrain performance and lower training efficiency. Prompt tuning is a much more efficient training paradigm than fine-tuning a deep model separately for each task. We therefore propose a unified pathological speech analysis system that covers as many as three diseases using prompt tuning. The system adjusts only a small fraction of the parameters to detect different diseases from the speech of potential patients. It leverages a pre-trained spoken language model and demonstrates strong performance across multiple disorders while only…
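
As a rough illustration of the prompt tuning idea described in the abstract, the following Python sketch keeps a stand-in backbone frozen and updates only a small per-disease prompt that is prepended to the speech frame embeddings. Everything in it is an assumption made for illustration and is not taken from the paper: the dimensions, the task names, the mean-pooling toy backbone, the names `task_prompts` and `tune_prompt_step`, and the use of finite-difference gradients in place of backpropagation.

```python
import math
import random

random.seed(0)

EMBED_DIM = 8    # hypothetical embedding width
PROMPT_LEN = 2   # hypothetical number of prompt vectors per task
TASKS = ["alzheimers", "depression", "parkinsons"]  # illustrative task names


def rand_matrix(rows, cols):
    """Return a rows x cols matrix of small random floats."""
    return [[random.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


# Frozen stand-in for a pre-trained model; nothing below ever updates these.
FROZEN_PROJ = rand_matrix(EMBED_DIM, EMBED_DIM)
FROZEN_HEAD = rand_matrix(EMBED_DIM, 2)  # two classes: negative / positive

# The only trainable parameters: one small prompt matrix per task.
task_prompts = {task: rand_matrix(PROMPT_LEN, EMBED_DIM) for task in TASKS}


def matvec(vec, matrix):
    """Multiply a row vector by a matrix."""
    return [sum(v * w for v, w in zip(vec, col)) for col in zip(*matrix)]


def forward(task, frames):
    """Prepend the task prompt, mean-pool, then apply the frozen layers."""
    seq = task_prompts[task] + frames
    pooled = [sum(col) / len(seq) for col in zip(*seq)]
    return matvec(matvec(pooled, FROZEN_PROJ), FROZEN_HEAD)


def loss(task, frames, label):
    """Cross-entropy of the two-class logits against an integer label."""
    logits = forward(task, frames)
    top = max(logits)
    log_z = top + math.log(sum(math.exp(x - top) for x in logits))
    return log_z - logits[label]


def tune_prompt_step(task, frames, label, lr=0.5, eps=1e-5):
    """One gradient step on this task's prompt only (central differences)."""
    prompt = task_prompts[task]
    grads = [[0.0] * EMBED_DIM for _ in range(PROMPT_LEN)]
    for i in range(PROMPT_LEN):
        for j in range(EMBED_DIM):
            original = prompt[i][j]
            prompt[i][j] = original + eps
            loss_up = loss(task, frames, label)
            prompt[i][j] = original - eps
            loss_down = loss(task, frames, label)
            prompt[i][j] = original  # restore before moving on
            grads[i][j] = (loss_up - loss_down) / (2 * eps)
    for i in range(PROMPT_LEN):
        for j in range(EMBED_DIM):
            prompt[i][j] -= lr * grads[i][j]
```

Calling `tune_prompt_step("depression", frames, label=1)` repeatedly on a list of `EMBED_DIM`-wide frame vectors lowers `loss` for that task, while `FROZEN_PROJ`, `FROZEN_HEAD`, and the prompts of the other two tasks stay unchanged. In a real system the frozen backbone would be far larger than the prompts, which is where the parameter savings come from, and automatic differentiation would replace the finite-difference loop.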

Taxonomy

Topics: Speech Recognition and Synthesis · Speech and Audio Processing