Separating Content from Speaker Identity in Speech for the Assessment of Cognitive Impairments
Dongseok Heo, Cheul Young Park, Jaemin Cheun, Myung Jin Ko

TL;DR
This paper investigates whether separating content from speaker identity in speech improves the assessment of cognitive impairments. It finds that content embeddings are more effective than speaker embeddings, but that their effectiveness depends on the information encoded in the speaker embeddings.
Contribution
It uses a voice conversion framework to separate content from speaker identity in speech and evaluates how effective the separated content information is for cognitive impairment assessment.
Findings
Content embeddings outperform speaker embeddings in assessment accuracy.
The effectiveness of content embeddings depends on the information encoded in speaker embeddings.
Simple classifiers using content embeddings show promising results on the DementiaBank Pitt Corpus.
Abstract
Deep speaker embeddings have been shown to be effective for assessing cognitive impairments, beyond their original purpose of speaker verification. However, research has found that speaker embeddings encode not only speaker identity but also an array of other information, including speaker demographics such as sex and age and, to an extent, speech content, which are known confounders in the assessment of cognitive impairments. In this paper, we hypothesize that content information separated from speaker identity using a voice conversion framework is more effective for assessing cognitive impairments, and we train simple classifiers for a comparative analysis on the DementiaBank Pitt Corpus. Our results show that content embeddings have an advantage over speaker embeddings for the defined problem, but further experiments show that their effectiveness depends on the information encoded in speaker embeddings due to…
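The comparative setup described in the abstract (the same simple classifier trained once per embedding type, then compared) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the embeddings are synthetic random vectors rather than features extracted from the DementiaBank Pitt Corpus, the classifier is a plain logistic regression chosen for illustration, and the names `fit_logreg`, `cv_accuracy`, and `make_synthetic` as well as the `signal` values are hypothetical.

```python
import numpy as np


def fit_logreg(X, y, lr=0.1, steps=500, l2=1e-2):
    """Fit L2-regularised logistic regression with plain gradient descent."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))  # predicted probabilities
        w -= lr * (X.T @ (p - y) / len(y) + l2 * w)
        b -= lr * np.mean(p - y)
    return w, b


def cv_accuracy(X, y, k=5, seed=0):
    """Mean k-fold accuracy of the classifier on one embedding type."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), k)
    accs = []
    for i in range(k):
        test = folds[i]
        train = np.concatenate([folds[j] for j in range(k) if j != i])
        # Standardise with training-fold statistics only, to avoid leakage.
        mu = X[train].mean(axis=0)
        sd = X[train].std(axis=0) + 1e-8
        w, b = fit_logreg((X[train] - mu) / sd, y[train])
        pred = ((X[test] - mu) / sd) @ w + b > 0
        accs.append(np.mean(pred == y[test]))
    return float(np.mean(accs))


def make_synthetic(n=200, dim=16, signal=1.0, seed=0):
    """Synthetic stand-in for per-recording embeddings (assumption, not real data)."""
    rng = np.random.default_rng(seed)
    y = rng.integers(0, 2, size=n).astype(float)
    X = rng.normal(size=(n, dim))
    X[:, 0] += signal * (2 * y - 1)  # inject label information into one dimension
    return X, y


if __name__ == "__main__":
    # The signal strengths are arbitrary and do not reflect the paper's results.
    for name, signal in [("content (synthetic)", 1.0), ("speaker (synthetic)", 0.5)]:
        X, y = make_synthetic(signal=signal, seed=0)
        print(name, cv_accuracy(X, y))
```

With real data, each row of `X` would be an embedding for one recording and `y` the diagnostic label. Because the corpus may contain several recordings per participant, splitting folds by participant rather than by recording would likely be needed to avoid speaker leakage between training and test folds.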
Taxonomy
Topics: Speech Recognition and Synthesis · Topic Modeling
