Comparison of Speaker Role Recognition and Speaker Enrollment Protocol for conversational Clinical Interviews
Rachid Riad, Hadrien Titeux, Laurie Lemoine, Justine Montillot, Agnes Sliwinski, Jennifer Hamet Bagnou, Xuan Nga Cao, Anne-Catherine Bachoud-Lévi, Emmanuel Dupoux

TL;DR
This study compares speaker role recognition and speaker enrollment methods using neural networks on clinical conversations, finding that role recognition performs best and emphasizing the importance of domain-specific retraining.
Contribution
It provides a comparative evaluation of speaker identification methods in clinical settings, highlighting the effectiveness of speaker role recognition models.
Findings
Speaker role recognition outperforms speaker enrollment.
Retraining models with in-domain data improves performance.
Results are consistent across different patient demographics.
Abstract
Conversations between a clinician and a patient, in natural conditions, are valuable sources of information for medical follow-up. The automatic analysis of these dialogues could help extract new language markers and speed up clinicians' reports. Yet it is not clear which speech processing pipeline performs best at detecting and identifying speaker turns, especially for individuals with speech and language disorders. Here, we propose a split of the data that allows a comparative evaluation of speaker role recognition and speaker enrollment methods for this task. We trained end-to-end neural network architectures adapted to each task and evaluated each approach under the same metric. Experimental results are reported on naturalistic clinical conversations between neuropsychologists and interviewees at different stages of Huntington's disease. We found that…
Taxonomy
Topics: Speech Recognition and Synthesis · Speech and dialogue systems
