Emotion Filtering at the Edge

Ranya Aloufi; Hamed Haddadi; David Boyle

arXiv:1909.08500 · eess.AS · September 19, 2019

TL;DR

This paper presents a privacy-preserving method for emotion filtering in voice inputs at the edge using CycleGAN, effectively reducing emotional state identification while maintaining speech recognition accuracy on low-cost devices.

Contribution

The authors introduce an edge-based emotion filtering approach using CycleGAN to protect user privacy without sacrificing speech recognition performance.

Findings

01. Emotion identification reduced by ~91%
02. Speech recognition accuracy differs by only ~0.16% from cloud-based methods
03. Effective implementation on Raspberry Pi 4

Abstract

Voice-controlled devices and services have become very popular in the consumer IoT. Cloud-based speech analysis services extract information from voice inputs using speech recognition techniques. Service providers can thus build very accurate profiles of users' demographic categories, personal preferences, emotional states, etc., and may therefore significantly compromise their privacy. To address this problem, we have developed a privacy-preserving intermediate layer between users and cloud services to sanitize voice input directly at edge devices. We use CycleGAN-based speech conversion to remove sensitive information from raw voice input signals before regenerating neutralized signals for forwarding. We implement and evaluate our emotion filtering approach using a relatively cheap Raspberry Pi 4, and show that performance accuracy is not compromised at the edge. In fact, signals…
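The intermediate layer described in the abstract can be pictured as a function that sits between the microphone and the cloud call. The following is a minimal sketch of that arrangement only, not the authors' implementation: `make_edge_filter`, `neutralize`, `forward`, and the toy stand-ins are all hypothetical names, and the toy conversion step is plain peak normalization, which does not remove emotional content.

```python
from typing import Callable, List

Signal = List[float]  # mono audio samples; a stand-in for a real waveform type


def make_edge_filter(
    neutralize: Callable[[Signal], Signal],
    forward: Callable[[Signal], str],
) -> Callable[[Signal], str]:
    """Compose an on-device sanitizing step with the call to a cloud service.

    `neutralize` stands in for a CycleGAN-based conversion model and
    `forward` for a cloud speech API. Both are hypothetical placeholders.
    """

    def handle(raw: Signal) -> str:
        sanitized = neutralize(raw)  # runs on the edge device
        return forward(sanitized)    # only the sanitized signal is forwarded

    return handle


# Toy stand-ins so the sketch runs end to end; neither is a real model or API.
def toy_neutralize(signal: Signal) -> Signal:
    # Peak normalization as a placeholder transform (NOT emotion removal).
    peak = max((abs(s) for s in signal), default=0.0)
    return [s / peak for s in signal] if peak else list(signal)


def toy_forward(signal: Signal) -> str:
    # Pretend cloud service that just reports how many samples it received.
    return f"received {len(signal)} samples"


edge_filter = make_edge_filter(toy_neutralize, toy_forward)
```

Calling `edge_filter(samples)` passes the raw samples through the placeholder transform before anything reaches `toy_forward`. The point of the composition is that the raw signal never appears as an argument to the forwarding function.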

Taxonomy

Topics: Music and Audio Processing · Speech and Audio Processing · Speech Recognition and Synthesis