A Comparison of Audio Signal Preprocessing Methods for Deep Neural Networks on Music Tagging

Keunwoo Choi; György Fazekas; Kyunghyun Cho; Mark Sandler

arXiv:1709.01922·cs.SD·February 23, 2021

TL;DR

This paper empirically evaluates various audio preprocessing techniques for music tagging with deep neural networks, finding that most common methods are redundant except for magnitude compression, which significantly impacts performance.

Contribution

It provides a comprehensive experimental comparison of audio preprocessing methods, highlighting the importance of magnitude compression for music tagging tasks.

Findings

01

Magnitude compression is essential for effective preprocessing.

02

Most traditional preprocessing techniques are redundant.

03

Of the preprocessing choices compared, only magnitude compression significantly affects neural network performance.

Abstract

In this paper, we empirically investigate the effect of audio preprocessing on music tagging with deep neural networks. We perform comprehensive experiments involving audio preprocessing using different time-frequency representations, logarithmic magnitude compression, frequency weighting, and scaling. We show that many commonly used input preprocessing techniques are redundant except magnitude compression.
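The preprocessing steps named in the abstract can be illustrated with a minimal sketch. It is not the authors' code: it assumes NumPy, a plain Hann-windowed STFT, decibel-style log compression of the power spectrogram, and whole-spectrogram standardization as one possible form of scaling. The function names (`stft_magnitude`, `log_compress`, `standardize`), the sample rate, the FFT size, and the synthetic test tone are all hypothetical choices made for illustration.

```python
import numpy as np

def stft_magnitude(signal, n_fft=512, hop=256):
    """Magnitude STFT with a Hann window, shaped (frames, frequency bins)."""
    window = np.hanning(n_fft)
    n_frames = 1 + (len(signal) - n_fft) // hop
    frames = np.stack([signal[i * hop : i * hop + n_fft] * window
                       for i in range(n_frames)])
    return np.abs(np.fft.rfft(frames, axis=1))

def log_compress(magnitude, eps=1e-10):
    """Decibel-style logarithmic compression of the power spectrogram.

    eps (an assumed floor) keeps log10 finite for silent bins.
    """
    power = magnitude ** 2
    return 10.0 * np.log10(np.maximum(power, eps))

def standardize(x):
    """Zero-mean, unit-variance scaling over the whole spectrogram."""
    return (x - x.mean()) / (x.std() + 1e-8)

# Synthetic one-second signal: a loud 440 Hz tone plus a quiet 3 kHz tone.
# The sample rate is an arbitrary illustration value.
sr = 16000
t = np.arange(sr) / sr
signal = 0.5 * np.sin(2 * np.pi * 440.0 * t) + 0.005 * np.sin(2 * np.pi * 3000.0 * t)

linear = stft_magnitude(signal)      # linear magnitudes, dominated by the loud tone
compressed = log_compress(linear)    # log scale, quiet components become visible
scaled = standardize(compressed)     # optional scaling step
```

On a linear scale the quiet tone is about 100 times smaller than the loud one; after log compression the two differ by roughly 40 dB, which puts both within a range a network input can represent comfortably.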

Code & Models

Repositories

GorillaBus/urban-audio-classifier
tf