TL;DR
This paper empirically evaluates common audio preprocessing techniques for music tagging with deep neural networks. It finds that most of them are redundant; the exception is magnitude compression, which significantly affects performance.
Contribution
It provides a comprehensive experimental comparison of audio preprocessing methods, highlighting the importance of magnitude compression for music tagging tasks.
Findings
Magnitude compression is essential for effective preprocessing.
Most traditional preprocessing techniques are redundant.
Preprocessing choices affect neural network performance mainly through magnitude compression.
Abstract
In this paper, we empirically investigate the effect of audio preprocessing on music tagging with deep neural networks. We perform comprehensive experiments involving audio preprocessing using different time-frequency representations, logarithmic magnitude compression, frequency weighting, and scaling. We show that many commonly used input preprocessing techniques are redundant except magnitude compression.
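To make the preprocessing steps named in the abstract concrete, the following is a minimal sketch of logarithmic magnitude compression followed by scaling, applied to a short-time Fourier magnitude spectrogram. It is not the authors' pipeline: the function names, the decibel-style formula, the `amin` floor, the window and hop sizes, and the synthetic two-tone signal are all assumptions chosen for illustration.

```python
import numpy as np


def magnitude_spectrogram(signal, n_fft=512, hop=256):
    """Return STFT magnitudes with shape (n_frames, n_fft // 2 + 1)."""
    window = np.hanning(n_fft)
    n_frames = 1 + (len(signal) - n_fft) // hop
    frames = np.stack(
        [signal[i * hop : i * hop + n_fft] * window for i in range(n_frames)]
    )
    return np.abs(np.fft.rfft(frames, axis=1))


def log_compress(magnitude, amin=1e-10):
    """Decibel-style logarithmic compression; amin (assumed) avoids log(0)."""
    return 20.0 * np.log10(np.maximum(magnitude, amin))


def standardize(x):
    """Scale to zero mean and unit variance over the whole array."""
    return (x - x.mean()) / (x.std() + 1e-8)


# Synthetic one-second input: a loud 440 Hz tone plus a 3 kHz tone
# whose amplitude is 1000 times smaller.
sr = 16000
t = np.arange(sr) / sr
signal = np.sin(2 * np.pi * 440 * t) + 0.001 * np.sin(2 * np.pi * 3000 * t)

mag = magnitude_spectrogram(signal)   # linear magnitudes
log_mag = log_compress(mag)           # compressed magnitudes in dB
scaled = standardize(log_mag)         # network-ready input

# Frequency bins closest to the two tones (bin = f * n_fft / sr).
loud_bin, quiet_bin = 14, 96
print("linear ratio:", mag[:, loud_bin].mean() / mag[:, quiet_bin].mean())
print("dB difference:", log_mag[:, loud_bin].mean() - log_mag[:, quiet_bin].mean())
```

In linear magnitude the quiet tone is roughly three orders of magnitude smaller than the loud one, while after compression the gap becomes an additive difference in decibels. This is one intuition for why compression could matter more to a network than the subsequent scaling step, which only shifts and rescales the values.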
