Multi-Level and Multi-Scale Feature Aggregation Using Sample-level Deep Convolutional Neural Networks for Music Classification

Jongpil Lee; Juhan Nam

arXiv:1706.06810·cs.SD·June 22, 2017·5 cites

TL;DR

This paper introduces a novel music classification method that leverages multi-level and multi-scale feature aggregation from sample-level deep CNNs trained on raw waveforms, achieving state-of-the-art results.

Contribution

It proposes a new approach combining multi-level and multi-scale feature aggregation with pre-trained sample-level deep CNNs for improved music classification.

Findings

01

Achieves state-of-the-art results on several music classification datasets

02

Effectively captures multi-level and multi-scale features

03

Demonstrates the effectiveness of raw waveform-based CNNs

Abstract

Music tag words that describe music audio by text have different levels of abstraction. Taking this issue into account, we propose a music classification approach that aggregates multi-level and multi-scale features using pre-trained feature extractors. In particular, the feature extractors are trained in sample-level deep convolutional neural networks using raw waveforms. We show that this approach achieves state-of-the-art results on several music classification datasets.
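
The aggregation step described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it assumes that activations have already been extracted from several pre-trained networks (one per input scale) at several layers (levels), that each feature map is max-pooled over time within a segment and then averaged across segments, and that the results are concatenated into one clip-level vector. The function name `aggregate_features`, the dictionary keys, the array shapes, and the choice of pooling are all hypothetical.

```python
import numpy as np


def aggregate_features(activations):
    """Concatenate pooled features from several scales and levels.

    activations maps a hypothetical (scale_name, layer_name) key to an
    array of shape (n_segments, n_frames, n_channels).
    """
    pooled = []
    for key in sorted(activations):  # fixed order keeps the layout stable
        feature_map = activations[key]
        # Assumed pooling: max over time frames within each segment ...
        per_segment = feature_map.max(axis=1)   # (n_segments, n_channels)
        # ... then mean over all segments of the clip.
        per_clip = per_segment.mean(axis=0)     # (n_channels,)
        pooled.append(per_clip)
    return np.concatenate(pooled)


# Random arrays stand in for real activations (illustrative shapes only).
rng = np.random.default_rng(0)
activations = {
    ("scale_a", "layer_low"): rng.standard_normal((10, 27, 128)),
    ("scale_a", "layer_high"): rng.standard_normal((10, 3, 256)),
    ("scale_b", "layer_low"): rng.standard_normal((5, 27, 128)),
    ("scale_b", "layer_high"): rng.standard_normal((5, 3, 256)),
}
clip_vector = aggregate_features(activations)
print(clip_vector.shape)
```

The resulting vector would then be passed to a separate classifier; which layers and scales to include is a design choice that this sketch leaves open.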

Code & Models

Repositories

jongpillee/music_dataset_split
none · Official

Taxonomy

Topics: Music and Audio Processing · Speech and Audio Processing · Music Technology and Sound Studies