A privacy-preserving method using secret key for convolutional neural network-based speech classification
Shoko Niwa, Sayaka Shiota, Hitoshi Kiya

TL;DR
This paper introduces a secret key-based encryption method for CNN-based speech classification that preserves privacy by encrypting data and model parameters, enabling accurate classification and robustness against reconstruction attacks.
Contribution
It presents a novel encryption technique using invertible random matrices for speech data and CNN models, specifically designed for speech classification tasks.
Findings
Encrypted speech data enables accurate classification with the correct key.
The method maintains classification performance in automatic speech recognition (ASR) and automatic speaker verification (ASV) tasks.
Encrypted data shows robustness against reconstruction attacks.
Abstract
In this paper, we propose a privacy-preserving method with a secret key for convolutional neural network (CNN)-based speech classification tasks. Many privacy-preserving methods have recently been developed for image classification. In contrast, little research has considered privacy risks in speech classification. To promote research on privacy preservation for speech classification, we provide an encryption method with a secret key for CNN-based speech classification systems. The encryption method is based on an invertible random matrix. Speech data encrypted with the correct key is accepted by a model whose kernel has been encrypted using the inverse of that random matrix. Although the encrypted speech data is strongly distorted, classification can still be performed correctly when the correct key is provided.…
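The relationship between the random matrix and the encrypted kernel can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: it assumes the first layer acts as a strided convolution over non-overlapping frames (kernel size equal to stride), so that the layer reduces to a matrix product, and every name in it (`make_key`, `to_frames`, `encrypt_signal`, `encrypt_kernel`, `first_layer`) is hypothetical. Under that assumption, multiplying each frame by a key matrix M and the kernel by the inverse of M leaves the layer output unchanged.

```python
import numpy as np


def make_key(block_size, seed):
    """Derive a random invertible matrix and its inverse from a seed.

    The seed plays the role of the secret key (an assumption of this sketch).
    """
    rng = np.random.default_rng(seed)
    while True:
        m = rng.standard_normal((block_size, block_size))
        # Reject ill-conditioned draws so the inverse is numerically stable.
        if np.linalg.cond(m) < 1e3:
            return m, np.linalg.inv(m)


def to_frames(signal, block_size):
    """Split a 1-D signal into non-overlapping frames, dropping any remainder."""
    n = len(signal) // block_size
    return signal[: n * block_size].reshape(n, block_size)


def encrypt_signal(signal, m, block_size):
    """Encrypt each frame by multiplying it with the key matrix."""
    return to_frames(signal, block_size) @ m


def encrypt_kernel(kernel, m_inv):
    """Encrypt the first-layer kernel with the inverse of the key matrix."""
    return m_inv @ kernel


def first_layer(frames, kernel):
    """Strided convolution with stride == kernel size, written as a matrix product."""
    return frames @ kernel


rng = np.random.default_rng(0)
block_size, channels = 8, 4
signal = rng.standard_normal(64)                       # stand-in for speech samples
kernel = rng.standard_normal((block_size, channels))   # stand-in for trained weights

m, m_inv = make_key(block_size, seed=1234)
plain_out = first_layer(to_frames(signal, block_size), kernel)
enc_out = first_layer(encrypt_signal(signal, m, block_size),
                      encrypt_kernel(kernel, m_inv))

# Data encrypted with a different key does not match the encrypted kernel.
wrong_m, _ = make_key(block_size, seed=9999)
wrong_out = first_layer(encrypt_signal(signal, wrong_m, block_size),
                        encrypt_kernel(kernel, m_inv))

print(np.allclose(plain_out, enc_out))    # correct key: outputs agree
print(np.allclose(plain_out, wrong_out))  # wrong key: outputs differ
```

In this sketch the encrypted frames are a dense linear mix of the original samples, while the downstream layers see the same activations as in the unencrypted case, which is the property the abstract describes. How the actual method handles overlapping windows, feature extraction, or key management is not covered here.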
Taxonomy
Topics: Chaos-based Image/Signal Encryption · Speech Recognition and Synthesis · Digital Media Forensic Detection
