Revisiting the Privacy of Low-Frequency Speech Signals: Exploring Resampling Methods, Evaluation Scenarios, and Speaker Characteristics
Jule Pohlhausen, Jörg Bitzer

TL;DR
This paper investigates how low-frequency audio resampling and anti-aliasing filtering affect speech privacy, measured by the word error rate of automatic speech recognition, and utility, measured with voice activity detection. It finds that anti-aliasing filters are crucial for privacy protection.
Contribution
It provides a comparative analysis of resampling methods and evaluates privacy and utility impacts across different speaker characteristics and filtering techniques.
Findings
Anti-aliasing filtering significantly enhances speech privacy.
For clean recordings, speech recognition models trained at an 800 Hz sampling rate still transcribe the majority of words correctly.
Speaker sex and pitch influence privacy and utility outcomes.
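The role of the anti-aliasing filter in the findings above can be illustrated with a minimal sketch. It is not the authors' resampling pipeline: the 16 kHz source rate, the 3 kHz test tone, the windowed-sinc filter design, and the helper names (lowpass_fir, downsample, rms) are all assumptions chosen for illustration. The sketch downsamples a tone that lies above the new Nyquist frequency, once by simply dropping samples and once after low-pass filtering.

```python
import numpy as np

def lowpass_fir(cutoff_hz, fs_hz, num_taps=401):
    """Windowed-sinc low-pass FIR (Hamming window) with unity gain at DC."""
    n = np.arange(num_taps) - (num_taps - 1) / 2
    h = np.sinc(2 * cutoff_hz / fs_hz * n) * np.hamming(num_taps)
    return h / h.sum()

def downsample(x, factor, fs_hz, anti_alias=True):
    """Keep every `factor`-th sample, optionally low-pass filtering first."""
    if anti_alias:
        new_nyquist_hz = fs_hz / factor / 2
        # Cutoff slightly below the new Nyquist frequency (illustrative choice).
        h = lowpass_fir(0.9 * new_nyquist_hz, fs_hz)
        x = np.convolve(x, h, mode="same")
    return x[::factor]

def rms(x):
    return float(np.sqrt(np.mean(np.square(x))))

fs = 16000            # assumed source sampling rate in Hz
factor = 20           # 16 kHz -> 800 Hz
t = np.arange(fs) / fs
tone = np.sin(2 * np.pi * 3000 * t)   # 3 kHz, far above the new 400 Hz Nyquist

naive = downsample(tone, factor, fs, anti_alias=False)
filtered = downsample(tone, factor, fs, anti_alias=True)

# Locate the strongest frequency in the unfiltered low-rate signal.
spectrum = np.abs(np.fft.rfft(naive))
freqs = np.fft.rfftfreq(len(naive), d=factor / fs)
alias_hz = float(freqs[np.argmax(spectrum)])

print(f"RMS without filter: {rms(naive):.3f}")
print(f"RMS with filter:    {rms(filtered):.3f}")
print(f"Alias frequency without filter: {alias_hz:.1f} Hz")
```

Without the filter, the 3 kHz tone folds down to 200 Hz and survives at full strength in the 800 Hz signal; with the filter it is largely removed. This only shows the signal-processing effect; how much folded content helps a recognizer is the empirical question the paper examines.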
Abstract
While audio recordings in real life provide insights into social dynamics and conversational behavior, they also raise concerns about the privacy of personal, sensitive data. This article explores the effectiveness of restricting recordings to low-frequency audio to protect spoken content. For resampling the audio signals to different sampling rates, we compare the effect of employing anti-aliasing filtering. Privacy enhancement is measured by an increased word error rate of automatic speech recognition models. The impact on utility performance is measured with voice activity detection models. Our experimental results show that for clean recordings, models trained with a sampling rate of up to 800 Hz transcribe the majority of words correctly. For both models, we analyzed the impact of the speaker's sex and pitch, and we demonstrated that missing anti-aliasing filters more strongly…
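The privacy measure named in the abstract, word error rate, is conventionally defined as the word-level edit distance (substitutions, deletions, and insertions) divided by the number of reference words. The following is a minimal sketch of that standard definition; the function name word_error_rate and the whitespace tokenization are assumptions, and the authors' evaluation tooling may normalize text differently.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing in hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)

# Hypothetical transcripts, for illustration only.
reference = "the cat sat on the mat"
hypothesis = "the hat sat mat"
print(word_error_rate(reference, hypothesis))
```

In this framing, a higher word error rate on the low-frequency audio means the recognizer recovers less of the spoken content, which is read as better privacy.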
Taxonomy
Topics: Speech Recognition and Synthesis · Speech and Audio Processing · Emotion and Mood Recognition
