SecureSpeech: Prompt-based Speaker and Content Protection

Belinda Soh Hui Hui; Xiaoxiao Miao; Xin Wang

arXiv:2507.07799·cs.SD·July 11, 2025

TL;DR

SecureSpeech introduces a prompt-based speech generation method that anonymizes both speaker identity and spoken content, enhancing privacy while maintaining speech quality and content fidelity.

Contribution

The paper presents a novel prompt-based pipeline for dual speaker and content anonymization in speech synthesis, addressing privacy concerns in speech data.

Findings

01. Achieves significant privacy protection against speaker re-identification

02. Maintains high speech quality and content retention

03. Analyzes bias introduced by different speaker descriptions

Abstract

Given increasing privacy concerns in the speech field, both from identity theft and from the re-identification of speakers through spoken content, this paper proposes a prompt-based speech generation pipeline that anonymizes both speaker identity and spoken content. It does so by 1) generating a speaker identity that is unlinkable to the source speaker and controlled by descriptors, and 2) replacing sensitive content in the original text using a named entity recognition model and a large language model. The pipeline then uses the anonymized speaker identity and text to generate high-fidelity, privacy-friendly speech via a text-to-speech synthesis model. Experimental results demonstrate significant privacy protection while maintaining a decent level of content retention and audio quality. This paper also investigates the impact of varying speaker descriptions on…
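The two steps described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not name the models used, so regular expressions stand in for the named entity recognition model, a fixed lookup table stands in for the large language model, and the text-to-speech stage is omitted entirely. All names here (`PATTERNS`, `SURROGATES`, `anonymize_text`, `build_request`, `AnonymizedRequest`) are hypothetical and chosen only for illustration.

```python
import re
from dataclasses import dataclass

# Assumption: simple regexes stand in for a named entity recognition model.
PATTERNS = {
    "PERSON": re.compile(r"\b(?:Mr|Ms|Mrs|Dr)\. [A-Z][a-z]+\b"),
    "PHONE": re.compile(r"\b\d{3}-\d{4}\b"),
}

# Assumption: a fixed table stands in for a large language model that
# would propose context-appropriate replacements for each entity.
SURROGATES = {
    "PERSON": "Dr. Lee",
    "PHONE": "555-0100",
}


@dataclass
class AnonymizedRequest:
    """Inputs a prompt-driven text-to-speech model would receive."""
    text: str            # transcript with sensitive spans replaced
    speaker_prompt: str  # descriptors that define the new, unlinked voice


def anonymize_text(text: str) -> str:
    """Replace every detected sensitive span with its surrogate."""
    for label, pattern in PATTERNS.items():
        text = pattern.sub(SURROGATES[label], text)
    return text


def build_request(text: str, speaker_prompt: str) -> AnonymizedRequest:
    """Pair the anonymized text with a descriptor-based speaker prompt."""
    return AnonymizedRequest(anonymize_text(text), speaker_prompt)


request = build_request(
    "Call Dr. Smith at 867-5309 tomorrow.",
    "a calm adult female voice",
)
print(request.text)
print(request.speaker_prompt)
```

The point of the sketch is the separation of concerns: the speaker prompt is chosen independently of the source audio, so no acoustic trace of the original speaker enters the synthesis input, while the text passes through a detect-then-replace step before synthesis.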

Taxonomy

Topics: Speech Recognition and Synthesis · Authorship Attribution and Profiling · Digital Media Forensic Detection