IDMap: A Pseudo-Speaker Generator Framework Based on Speaker Identity Index to Vector Mapping
Zeyan Liu, Liping Chen, Kong Aik Lee, Zhenhua Ling

TL;DR
This paper introduces IDMap, a pseudo-speaker generator framework that improves pseudo-speaker uniqueness, and with it voice privacy protection, while reducing computational cost. Both properties make it suitable for large-scale voice anonymization.
Contribution
The paper proposes IDMap, a framework that generates pseudo-speakers by mapping a speaker identity index to a speaker vector, addressing the limited uniqueness and efficiency of existing methods.
Findings
Enhanced pseudo-speaker uniqueness improves privacy protection.
Reduced computational cost compared to model-based methods.
Effective in large-scale scenarios with many pseudo-speakers.
Abstract
Facilitated by the speech generation framework that disentangles speech into content, speaker, and prosody, voice anonymization is accomplished by substituting the original speaker embedding vector with that of a pseudo-speaker. In this framework, pseudo-speaker generation is a fundamental challenge. Current pseudo-speaker generation methods are limited in the uniqueness of the pseudo-speakers they produce, which restricts their effectiveness in voice privacy protection. In addition, existing model-based methods suffer from heavy computational costs. These limitations in uniqueness and computational efficiency become more significant in large-scale scenarios, where a huge number of pseudo-speakers are generated. To this end, this paper proposes a framework for pseudo-speaker generation, which establishes a mapping from speaker identity index to speaker vector in the…
Taxonomy
Topics: Speech Recognition and Synthesis · Voice and Speech Disorders · Speech and Audio Processing
