DiDOTS: Knowledge Distillation from Large-Language-Models for Dementia Obfuscation in Transcribed Speech
Dominika Woszczyk, Soteris Demetriou

TL;DR
This paper introduces DiDOTS, a knowledge distillation method that leverages large language models to obfuscate dementia-related information in speech transcripts, enhancing privacy while maintaining utility.
Contribution
The paper proposes a novel, parameter-efficient approach that distills LLM-based dementia obfuscation into a much smaller model, addressing the deployment cost and fragility of large models.
Findings
LLMs outperform existing obfuscation methods in privacy protection.
DiDOTS achieves comparable privacy performance with significantly fewer parameters.
Humans rate DiDOTS as better at preserving transcript utility.
Abstract
Dementia is a sensitive neurocognitive disorder affecting tens of millions of people worldwide and its cases are expected to triple by 2050. Alarmingly, recent advancements in dementia classification make it possible for adversaries to violate affected individuals' privacy and infer their sensitive condition from speech transcriptions. Existing obfuscation methods in text have never been applied for dementia and depend on the availability of large labeled datasets which are challenging to collect for sensitive medical attributes. In this work, we bridge this research gap and tackle the above issues by leveraging Large-Language-Models (LLMs) with diverse prompt designs (zero-shot, few-shot, and knowledge-based) to obfuscate dementia in speech transcripts. Our evaluation shows that LLMs are more effective dementia obfuscators compared to competing methods. However, they have billions of…
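As a rough illustration of the three prompt designs the abstract names (zero-shot, few-shot, and knowledge-based), the sketch below builds one prompt string of each kind. It is not the authors' implementation: the instruction wording, the function names, and the reading of "knowledge-based" as "list linguistic cues in the prompt" are all assumptions made for this example.

```python
from typing import List, Tuple

# Hypothetical instruction text; the paper's actual prompts are not shown here.
INSTRUCTION = (
    "Rewrite the transcript so that it no longer reveals signs of dementia, "
    "while preserving its meaning."
)


def zero_shot_prompt(transcript: str) -> str:
    """Instruction plus the transcript, with no demonstrations."""
    return f"{INSTRUCTION}\n\nTranscript: {transcript}\nRewritten:"


def few_shot_prompt(transcript: str, examples: List[Tuple[str, str]]) -> str:
    """Instruction, a few (original, rewritten) demonstrations, then the transcript."""
    demos = "\n\n".join(
        f"Transcript: {src}\nRewritten: {tgt}" for src, tgt in examples
    )
    return f"{INSTRUCTION}\n\n{demos}\n\nTranscript: {transcript}\nRewritten:"


def knowledge_prompt(transcript: str, cues: List[str]) -> str:
    """Instruction plus a list of cues to avoid (an assumed reading of 'knowledge-based')."""
    cue_text = "\n".join(f"- {cue}" for cue in cues)
    return (
        f"{INSTRUCTION}\n\nCues to avoid:\n{cue_text}"
        f"\n\nTranscript: {transcript}\nRewritten:"
    )


if __name__ == "__main__":
    # Made-up transcript and cue list, for illustration only.
    text = "the boy is uh taking the the cookie"
    print(zero_shot_prompt(text))
    print(knowledge_prompt(text, ["filled pauses", "word repetitions"]))
```

Each function returns a plain string, so the output can be passed to whichever LLM client is in use. Keeping the three builders separate makes it easy to compare prompt designs on the same transcript.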
Taxonomy
Topics: Speech Recognition and Synthesis
