TL;DR
This paper investigates how well randomly weighted (untrained) convolutional neural networks work as feature extractors for music audio classification, and presents evidence that the choice of architecture alone contributes substantially to performance.
Contribution
The paper provides a comprehensive evaluation of current deep architectures for audio classification with their weights left random, which isolates the architecture's contribution to performance from that of training.
Findings
Randomly weighted CNNs can serve as effective feature extractors.
Architecture choice significantly impacts classification accuracy.
Deep architectures outperform simpler models even without training.
Abstract
The computer vision literature shows that randomly weighted neural networks perform reasonably as feature extractors. Following this idea, we study how non-trained (randomly weighted) convolutional neural networks perform as feature extractors for (music) audio classification tasks. We use features extracted from the embeddings of deep architectures as input to a classifier, with the goal of comparing classification accuracies when using different randomly weighted architectures. By following this methodology, we run a comprehensive evaluation of the current deep architectures for audio classification, and provide evidence that the architectures alone are an important piece for resolving (music) audio problems using deep neural networks.
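The methodology in the abstract (embed audio with an untrained network, then train only a classifier on those embeddings) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the single random convolutional layer, the synthetic two-class sine-wave "audio", the ridge least-squares classifier, and all function names (`random_conv_features`, `make_toy_audio`, `fit_ridge_classifier`, `predict`, `run_demo`) are hypothetical.

```python
import numpy as np


def random_conv_features(signals, n_filters=32, filter_len=64, seed=0):
    """Embed 1-D signals with one untrained (randomly weighted) conv layer."""
    rng = np.random.default_rng(seed)
    # Random filters that are never trained; scaled to roughly unit norm.
    filters = rng.standard_normal((n_filters, filter_len)) / np.sqrt(filter_len)
    # Sliding windows over time: shape (n_signals, n_frames, filter_len).
    windows = np.lib.stride_tricks.sliding_window_view(signals, filter_len, axis=1)
    # Convolution written as a matrix product, followed by ReLU.
    activations = np.maximum(windows @ filters.T, 0.0)
    # Global max and mean pooling over time give a fixed-size embedding.
    return np.concatenate([activations.max(axis=1), activations.mean(axis=1)], axis=1)


def make_toy_audio(n_per_class=100, length=1024, seed=1):
    """Synthetic stand-in for audio: noisy sines, one frequency per class."""
    rng = np.random.default_rng(seed)
    t = np.arange(length)
    freqs = [0.01, 0.08]  # cycles per sample (assumed values)
    X, y = [], []
    for label, f in enumerate(freqs):
        phase = rng.uniform(0, 2 * np.pi, size=(n_per_class, 1))
        noise = 0.3 * rng.standard_normal((n_per_class, length))
        X.append(np.sin(2 * np.pi * f * t + phase) + noise)
        y.append(np.full(n_per_class, label))
    return np.vstack(X), np.concatenate(y)


def fit_ridge_classifier(X, y, n_classes, reg=1e-3):
    """Only this classifier is trained; the feature extractor stays random."""
    mean, std = X.mean(axis=0), X.std(axis=0) + 1e-8
    Xs = np.hstack([(X - mean) / std, np.ones((len(X), 1))])
    Y = np.eye(n_classes)[y]  # one-hot targets
    W = np.linalg.solve(Xs.T @ Xs + reg * np.eye(Xs.shape[1]), Xs.T @ Y)
    return mean, std, W


def predict(model, X):
    mean, std, W = model
    Xs = np.hstack([(X - mean) / std, np.ones((len(X), 1))])
    return (Xs @ W).argmax(axis=1)


def run_demo(seed=0):
    """Return held-out accuracy of the classifier on random-CNN embeddings."""
    X, y = make_toy_audio()
    feats = random_conv_features(X, seed=seed)
    order = np.random.default_rng(seed).permutation(len(y))
    train, test = order[:150], order[150:]
    model = fit_ridge_classifier(feats[train], y[train], n_classes=2)
    return float((predict(model, feats[test]) == y[test]).mean())


if __name__ == "__main__":
    print(f"held-out accuracy: {run_demo():.3f}")
```

To compare architectures in the spirit of the abstract, one would keep the data and classifier fixed and swap `random_conv_features` for a different randomly initialized extractor (for example, different filter lengths or more layers), then compare the resulting accuracies.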
