MP3 Compression To Diminish Adversarial Noise in End-to-End Speech Recognition

Iustina Andronic, Ludwig Kürzinger, Edgar Ricardo Chavez Rosas, Gerhard Rigoll, Bernhard U. Seeber

arXiv:2007.12892·eess.AS·July 28, 2020

TL;DR

MP3 compression reduces adversarial noise in audio inputs, lowering speech recognition error rates on adversarial examples while having little effect on the recognition of normal speech.

Contribution

The paper proposes MP3 compression as a way to diminish adversarial noise in audio samples for end-to-end speech recognition systems, and validates it with two objective metrics: Character Error Rate (CER) and Signal-to-Noise Ratio (SNR).

Findings

01

MP3 compression reduces character error rates on adversarial examples.

02

MP3 compression increases signal-to-noise ratio in reconstructed audio.

03

MP3 compression does not significantly affect normal speech recognition accuracy.
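
The first finding is stated in terms of character error rate. As a rough illustration, CER is commonly computed as the character-level edit distance between a reference transcript and the ASR hypothesis, divided by the reference length. The sketch below assumes that common definition; the function names are hypothetical and it is not the paper's scoring code.

```python
def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance between two strings (insert, delete, substitute)."""
    # prev[j] holds the distance between the processed prefix of ref and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def character_error_rate(ref: str, hyp: str) -> float:
    """CER = edit distance / number of reference characters (assumed definition)."""
    if not ref:
        raise ValueError("reference transcript must not be empty")
    return edit_distance(ref, hyp) / len(ref)
```

Under this definition, a lower CER on MP3-compressed adversarial examples than on uncompressed ones would correspond to the improvement described in finding 01.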

Abstract

Audio Adversarial Examples (AAE) represent specially created inputs meant to trick Automatic Speech Recognition (ASR) systems into misclassification. The present work proposes MP3 compression as a means to decrease the impact of Adversarial Noise (AN) in audio samples transcribed by ASR systems. To this end, we generated AAEs with the Fast Gradient Sign Method for an end-to-end, hybrid CTC-attention ASR system. Our method is then validated by two objective indicators: (1) Character Error Rates (CER) that measure the speech decoding performance of four ASR models trained on uncompressed, as well as MP3-compressed data sets and (2) Signal-to-Noise Ratio (SNR) estimated for both uncompressed and MP3-compressed AAEs that are reconstructed in the time domain by feature inversion. We found that MP3 compression applied to AAEs indeed reduces the CER when compared to uncompressed AAEs.…
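
To make the two technical ingredients of the abstract more concrete, here is a minimal sketch of an FGSM-style perturbation and an SNR estimate. It makes several assumptions that are not taken from the paper: the perturbation is applied directly to a list of samples (the paper works with an end-to-end ASR system and reconstructs audio by feature inversion), the gradient is simply given as input, the untargeted sign convention is used, and the names `fgsm_perturb`, `snr_db`, and `epsilon` are hypothetical.

```python
import math


def sign(v: float) -> int:
    """Return -1, 0, or 1 depending on the sign of v."""
    return (v > 0) - (v < 0)


def fgsm_perturb(x, grad, epsilon):
    """One FGSM step: move each input value by epsilon along the gradient sign.

    `grad` stands in for the gradient of the model's loss with respect to the
    input; computing it requires a model and is outside this sketch.
    """
    return [xi + epsilon * sign(gi) for xi, gi in zip(x, grad)]


def snr_db(signal, noise):
    """Signal-to-noise ratio in decibels from mean signal and noise power."""
    p_signal = sum(s * s for s in signal) / len(signal)
    p_noise = sum(n * n for n in noise) / len(noise)
    return 10.0 * math.log10(p_signal / p_noise)


# Toy usage: perturb a short signal, then measure the SNR of the added noise.
clean = [0.5, -0.5, 0.25, -0.25]
grad = [1.0, -2.0, 0.3, -0.7]  # placeholder gradient values
adversarial = fgsm_perturb(clean, grad, epsilon=0.01)
noise = [a - c for a, c in zip(adversarial, clean)]
print(f"SNR: {snr_db(clean, noise):.2f} dB")
```

In this framing, a higher SNR after MP3 compression would indicate that less adversarial noise remains relative to the speech signal.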

Code & Models

Repositories

iustinaabc/ASR-mp3-compression-AAEs (official)