Two-Stage Voice Anonymization for Enhanced Privacy

Francesco Nespoli; Daniel Barreda; Joerg Bitzer; Patrick A. Naylor

arXiv:2306.16069 · eess.AS · June 29, 2023

Open Access

TL;DR

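This paper introduces a two-stage voice anonymization system that combines a state-of-the-art anonymization model from the Voice Privacy Challenge 2022 with a zero-shot voice conversion architecture, aiming for strong privacy preservation while preserving pitch information. It also proposes a new compressed metric for evaluating anonymization systems.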

Contribution

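The paper presents a two-stage anonymization pipeline that achieves strong privacy preservation while preserving pitch information, along with a new compressed metric for evaluating anonymization systems under different constraints on privacy and utility.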

Findings

01 Strong privacy preservation demonstrated

02 Effective pitch retention in anonymized speech

03 New metric for privacy-utility evaluation

Abstract

In recent years, the need for privacy preservation when manipulating or storing personal data, including speech, has become a major issue. In this paper, we present a system addressing the speaker-level anonymization problem. We propose and evaluate a two-stage anonymization pipeline exploiting a state-of-the-art anonymization model described in the Voice Privacy Challenge 2022 in combination with a zero-shot voice conversion architecture able to capture speaker characteristics from a few seconds of speech. We show this architecture can lead to strong privacy preservation while preserving pitch information. Finally, we propose a new compressed metric to evaluate anonymization systems in privacy scenarios with different constraints on privacy and utility.
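As a rough illustration of the two-stage structure described in the abstract, the sketch below chains two processing stages so that the output of the first feeds the second. Everything here is an assumption made for illustration: the names `make_pipeline`, `stage_one_anonymize`, and `stage_two_voice_conversion` are hypothetical, and the stage bodies are trivial placeholders, not the Voice Privacy Challenge 2022 model or the zero-shot voice conversion architecture used in the paper.

```python
from typing import Callable, List, Sequence

Waveform = List[float]
Stage = Callable[[Waveform], Waveform]


def make_pipeline(stages: Sequence[Stage]) -> Stage:
    """Return a function that applies the given stages in order."""
    def run(wave: Waveform) -> Waveform:
        for stage in stages:
            wave = stage(wave)
        return wave
    return run


def stage_one_anonymize(wave: Waveform) -> Waveform:
    # Placeholder for a first-stage anonymization model (hypothetical):
    # here it only halves the amplitude of each sample.
    return [0.5 * x for x in wave]


def stage_two_voice_conversion(wave: Waveform) -> Waveform:
    # Placeholder for a second-stage voice conversion model (hypothetical):
    # here it only flips the sign of each sample.
    return [-x for x in wave]


# Stage two consumes the output of stage one.
two_stage = make_pipeline([stage_one_anonymize, stage_two_voice_conversion])
result = two_stage([0.2, -0.4, 1.0])
```

The only point of the sketch is the ordering: the second stage never sees the original speech, only the already-anonymized output of the first stage.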

Taxonomy

Topics: Speech Recognition and Synthesis · Voice and Speech Disorders