TL;DR
This paper introduces a novel speaker de-identification method that combines formant shifts with functional data analysis of f0 trajectories, effectively concealing speaker-specific pitch cues and enhancing privacy in speech data.
Contribution
It proposes a new de-identification technique that manipulates f0 trajectories using functional data analysis alongside formant shifts, improving privacy without requiring training data.
Findings
Improves formant-based de-identification by up to 25%.
Effectively conceals speaker-specific pitch cues.
Operates without training data, suitable for under-resourced languages.
Abstract
As the amount of speech data stored in various types of databases keeps growing, voice privacy has become a major concern. In response, speech researchers have developed various methods for speaker de-identification. State-of-the-art solutions rely on deep learning, which can be effective but may be unavailable or impractical for, for example, under-resourced languages. Formant modification is a simpler yet effective method for speaker de-identification that requires no training data. Still, the intonational patterns that remain in formant-anonymized speech may contain speaker-dependent cues. This study introduces a novel speaker de-identification method which, in addition to simple formant shifts, manipulates f0 trajectories based on functional data analysis. The proposed speaker de-identification method will conceal plausibly…
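To make the idea of manipulating f0 trajectories with functional data analysis more concrete, here is a minimal sketch. It is not the authors' implementation: the excerpt above does not say which functional technique or manipulation the paper uses, so this sketch assumes a discretized functional PCA over time-normalized f0 contours and a hypothetical manipulation that rescales the principal component scores. The function names (`resample_contour`, `fpca`, `flatten_scores`) and parameters (`n_points`, `n_components`, `scale`) are illustrative assumptions, and formant shifting is not shown.

```python
import numpy as np

def resample_contour(f0, n_points=50):
    """Linearly interpolate a voiced f0 contour (Hz) onto a normalized time grid [0, 1]."""
    f0 = np.asarray(f0, dtype=float)
    src = np.linspace(0.0, 1.0, len(f0))
    dst = np.linspace(0.0, 1.0, n_points)
    return np.interp(dst, src, f0)

def fpca(contours, n_components=2):
    """Discretized functional PCA: returns the mean curve, PC curves, and per-contour scores."""
    X = np.vstack(contours)
    mean = X.mean(axis=0)
    centered = X - mean
    # Rows of vt are orthonormal principal component curves on the time grid.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    components = vt[:n_components]
    scores = centered @ components.T
    return mean, components, scores

def flatten_scores(mean, components, scores, scale=0.0):
    """Hypothetical manipulation: rebuild contours from the retained components
    with their scores multiplied by `scale`. With scale=0 every contour collapses
    onto the mean curve; variation outside the retained components is discarded."""
    return mean + (scores * scale) @ components

# Toy usage with synthetic contours of different lengths (not real speech data).
rng = np.random.default_rng(0)
contours = [
    resample_contour(100 + 20 * np.sin(np.linspace(0, 3, n)) + rng.normal(0, 1, n))
    for n in (30, 40, 55, 70)
]
mean, components, scores = fpca(contours, n_components=2)
modified = flatten_scores(mean, components, scores, scale=0.5)
```

In this sketch, a `scale` below 1 pulls each contour toward the average intonation shape, which is one plausible way to weaken speaker-specific pitch cues. Whether the paper shrinks, shifts, or otherwise alters the trajectories is not stated in the excerpt.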
