DiDOTS: Knowledge Distillation from Large-Language-Models for Dementia Obfuscation in Transcribed Speech

Dominika Woszczyk; Soteris Demetriou

arXiv:2410.04188·cs.CL·October 8, 2024

TL;DR

This paper introduces DiDOTS, a knowledge distillation method that leverages large language models to obfuscate dementia-related information in speech transcripts, enhancing privacy while maintaining utility.

Contribution

The paper proposes a novel, parameter-efficient approach to distilling knowledge from LLMs for dementia obfuscation, addressing the difficulty of deploying large models and their fragility.

Findings

01. LLMs outperform existing obfuscation methods in privacy protection.

02. DiDOTS achieves comparable privacy performance with significantly fewer parameters.

03. Humans rate DiDOTS as better at preserving transcript utility.

Abstract

Dementia is a sensitive neurocognitive disorder affecting tens of millions of people worldwide and its cases are expected to triple by 2050. Alarmingly, recent advancements in dementia classification make it possible for adversaries to violate affected individuals' privacy and infer their sensitive condition from speech transcriptions. Existing obfuscation methods in text have never been applied for dementia and depend on the availability of large labeled datasets which are challenging to collect for sensitive medical attributes. In this work, we bridge this research gap and tackle the above issues by leveraging Large-Language-Models (LLMs) with diverse prompt designs (zero-shot, few-shot, and knowledge-based) to obfuscate dementia in speech transcripts. Our evaluation shows that LLMs are more effective dementia obfuscators compared to competing methods. However, they have billions of…

Taxonomy

Topics: Speech Recognition and Synthesis