Robust Lossy Audio Compression Identification

Hendrik Vincent Koops; Gianluca Micchi; Elio Quinton

arXiv:2407.21545·cs.SD·August 1, 2024

Open Access

TL;DR

This paper investigates the robustness of lossy audio compression identification models, revealing their vulnerability to unseen codec parameters and proposing a new training strategy to improve generalization.

Contribution

The paper demonstrates that existing lossy compression identification models lack robustness to codec parameter variations, and it introduces a masking-based training method that improves generalization to unseen codec settings.

Findings

01. Models are sensitive to codec parameter variations.

02. Masking input spectrograms improves robustness.

03. Proposed method significantly increases generalization capability.
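
To make the second finding concrete, here is a minimal sketch of masking an input spectrogram before it is fed to a classifier during training. This page does not say which regions the authors mask or how the mask is sampled, so the choice of a single random band of frequency bins, the function name `mask_frequency_band`, and the `max_width` and `fill` parameters are all assumptions made for illustration, not the paper's method.

```python
import random


def mask_frequency_band(spectrogram, max_width, rng, fill=0.0):
    """Return a masked copy of `spectrogram` and the (start, width) of the mask.

    `spectrogram` is a list of rows, one row per frequency bin, each row a
    list of per-frame magnitudes. One randomly placed band of at most
    `max_width` bins is overwritten with `fill` (an assumed masking scheme).
    """
    n_bins = len(spectrogram)
    if n_bins == 0 or max_width < 1:
        raise ValueError("need at least one frequency bin and max_width >= 1")

    width = rng.randint(1, min(max_width, n_bins))  # randint is inclusive
    start = rng.randint(0, n_bins - width)

    masked = [row[:] for row in spectrogram]  # copy so the input is untouched
    for b in range(start, start + width):
        masked[b] = [fill] * len(masked[b])
    return masked, (start, width)


# Toy input: 8 frequency bins by 4 frames, all ones.
rng = random.Random(0)
spec = [[1.0] * 4 for _ in range(8)]
masked, (start, width) = mask_frequency_band(spec, max_width=3, rng=rng)
```

Applying a fresh mask to every training example means the classifier cannot count on any single frequency region being visible, which is one plausible reading of why masking would reduce dependence on codec-specific cues.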

Abstract

Previous research contributions on blind lossy compression identification report near-perfect performance metrics on their test set, across a variety of codecs and bit rates. However, we show that such results can be deceptive and may not accurately represent the true ability of the system to tackle the task at hand. In this article, we present an investigation into the robustness and generalisation capability of a lossy audio identification model. Our contributions are as follows. (1) We show the lack of robustness to codec parameter variations of a model equivalent to prior art. In particular, when naively training a lossy compression detection model on a dataset of music recordings processed with a range of codecs and their lossless counterparts, we obtain near-perfect performance metrics on the held-out test set, but severely degraded performance on lossy tracks produced with codec…
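
One way to expose the gap the abstract describes is to score lossy tracks separately, depending on whether their codec setting was present in the training data. The sketch below assumes each evaluated track is a dictionary with hypothetical `codec`, `bitrate`, and `detected` fields; the records are made-up toy values for illustration and are not results from the paper.

```python
def detection_rate_by_setting(records, seen_settings):
    """Detection rate on lossy tracks, split by seen vs. unseen codec setting.

    `records` holds one dict per lossy track; `seen_settings` is the set of
    (codec, bitrate) pairs that appeared in training. A group with no tracks
    gets None instead of a rate.
    """
    counts = {"seen": [0, 0], "unseen": [0, 0]}  # [detected, total]
    for r in records:
        group = "seen" if (r["codec"], r["bitrate"]) in seen_settings else "unseen"
        counts[group][0] += int(r["detected"])
        counts[group][1] += 1
    return {g: (hit / total if total else None) for g, (hit, total) in counts.items()}


# Toy records (illustrative only).
seen_settings = {("mp3", 128), ("aac", 128)}
records = [
    {"codec": "mp3", "bitrate": 128, "detected": True},
    {"codec": "aac", "bitrate": 128, "detected": True},
    {"codec": "mp3", "bitrate": 96, "detected": False},
    {"codec": "aac", "bitrate": 192, "detected": True},
]
rates = detection_rate_by_setting(records, seen_settings)
```

Reporting the two rates side by side, rather than a single pooled test score, keeps a strong result on familiar settings from hiding a weak one on unfamiliar settings.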

Taxonomy

Topics: Speech and Audio Processing · Music and Audio Processing