Distinctive and Natural Speaker Anonymization via Singular Value Transformation-assisted Matrix

Jixun Yao; Qing Wang; Pengcheng Guo; Ziqian Ning; Lei Xie

arXiv:2405.10786·eess.AS·May 20, 2024·IEEE ACM Trans. Audio Speech Lang. Process.

Open Access

TL;DR

This paper introduces a novel speaker anonymization method using singular value transformation of speaker-related matrices, effectively protecting privacy while preserving speech naturalness and speaker distinctiveness.

Contribution

It proposes a matrix-based anonymization technique leveraging SVD and attention mechanisms to improve privacy and speech quality over existing methods.

Findings

01. Effective privacy protection against various attack scenarios

02. Maintains speech naturalness and speaker distinctiveness

03. Outperforms baseline methods in experiments

Abstract

Speaker anonymization is an effective privacy protection solution that aims to conceal the speaker's identity while preserving the naturalness and distinctiveness of the original speech. Mainstream approaches use an utterance-level vector from a pre-trained automatic speaker verification (ASV) model to represent speaker identity, which is then averaged or modified for anonymization. However, these systems suffer from deterioration in the naturalness of anonymized speech, degradation in speaker distinctiveness, and severe privacy leakage against powerful attackers. To address these issues and especially generate more natural and distinctive anonymized speech, we propose a novel speaker anonymization approach that models a matrix related to speaker identity and transforms it into an anonymized singular value transformation-assisted matrix to conceal the original speaker identity. Our…
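The abstract is cut off before it says how the singular values are transformed, so the following is only a generic sketch of what "transforming the singular values of a matrix" can look like, not the authors' method. The function name `transform_singular_values`, the power-law rescaling, the `power` parameter, and the random 4 x 8 input matrix are all assumptions introduced for illustration.

```python
import numpy as np


def transform_singular_values(speaker_matrix, power=0.5):
    """Hypothetical illustration: reshape the singular-value spectrum of a matrix.

    The power-law rule below is an assumption for demonstration only;
    it is not taken from the paper.
    """
    # Thin SVD: speaker_matrix == U @ diag(s) @ Vt, with s sorted in descending order.
    U, s, Vt = np.linalg.svd(speaker_matrix, full_matrices=False)

    # Assumed transformation: compress the spectrum with a power law,
    # then rescale so the largest singular value is unchanged.
    s_new = s ** power
    s_new = s_new * (s[0] / s_new[0])

    # Rebuild a matrix of the same shape from the modified spectrum.
    return U @ np.diag(s_new) @ Vt


# Stand-in for a matrix related to speaker identity (random values, not real data).
rng = np.random.default_rng(0)
original = rng.standard_normal((4, 8))
anonymized = transform_singular_values(original, power=0.5)
```

With `power=1.0` the sketch returns the input matrix up to floating-point error; smaller values flatten the spectrum while keeping the singular vectors fixed. Which transformation the paper actually applies, and to which matrix, has to be taken from the full text.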

Taxonomy

Topics: Speech Recognition and Synthesis · Natural Language Processing Techniques · Speech and Audio Processing