Mitigating Language Mismatch in SSL-Based Speaker Anonymization

Zhe Zhang; Wen-Chin Huang; Xin Wang; Xiaoxiao Miao; Junichi Yamagishi

arXiv:2507.00458·eess.AS·July 2, 2025

TL;DR

This paper addresses language mismatch in speaker anonymization systems by fine-tuning SSL models with Japanese speech. Fine-tuning improves intelligibility while maintaining privacy, and a multilingual SSL model extends utility across languages, as evaluated on Japanese and Mandarin.

Contribution

The paper adapts an SSL-based content encoder to the target language by fine-tuning it with Japanese speech, and proposes fine-tuning a multilingual SSL model so that speaker anonymization remains useful for non-English languages.

Findings

01. Fine-tuning SSL with Japanese speech improves intelligibility.

02. Multilingual SSL models extend utility across languages.

03. Language adaptation is crucial for robust multilingual anonymization.

Abstract

Speaker anonymization aims to protect speaker identity while preserving content information and the intelligibility of speech. However, most speaker anonymization systems (SASs) are developed and evaluated using only English, resulting in degraded utility for other languages. This paper investigates language mismatch in SASs for Japanese and Mandarin speech. First, we fine-tune a self-supervised learning (SSL)-based content encoder with Japanese speech to verify effective language adaptation. Then, we propose fine-tuning a multilingual SSL model with Japanese speech and evaluating the SAS in Japanese and Mandarin. Downstream experiments show that fine-tuning an English-only SSL model with the target language enhances intelligibility while maintaining privacy and that multilingual SSL further extends SASs' utility across different languages. These findings highlight the importance of…
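
The abstract reports intelligibility gains without stating, in the portion shown here, how intelligibility is measured. As a rough illustration only, the sketch below assumes intelligibility is scored as the character error rate (CER) between a reference transcript and an ASR transcript of the anonymized speech, a common choice for Japanese and Mandarin because text in these languages is not space-delimited. The function names `edit_distance` and `character_error_rate` are hypothetical and are not taken from the paper.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (two-row dynamic programming)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def character_error_rate(reference, hypothesis):
    """CER = character-level edit distance / number of reference characters.

    Assumption: whitespace is ignored, since Japanese and Mandarin
    transcripts are usually compared character by character.
    """
    ref = [c for c in reference if not c.isspace()]
    hyp = [c for c in hypothesis if not c.isspace()]
    if not ref:
        raise ValueError("reference must contain at least one character")
    return edit_distance(ref, hyp) / len(ref)
```

Under this assumption, a lower CER on anonymized speech would indicate better preserved content, so comparing the CER before and after fine-tuning the SSL content encoder is one way the reported intelligibility improvement could be quantified.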

Taxonomy

Topics: Speech Recognition and Synthesis · Authorship Attribution and Profiling · Privacy-Preserving Technologies in Data