Automatic Music Highlight Extraction using Convolutional Recurrent Attention Networks

Jung-Woo Ha; Adrian Kim; Chanju Kim; Jangyeon Park; Sunghun Kim

arXiv:1712.05901 · cs.LG · December 19, 2017 · 5 cites

TL;DR

This paper introduces a music highlight extraction method that uses high-level features from convolutional recurrent attention networks (CRAN), and reports that it outperforms three baseline methods on over 32,000 popular tracks in Korea.

Contribution

The paper presents a new CRAN-based approach that leverages attention mechanisms for effective music highlight extraction, emphasizing high-level features over traditional low-level signal features.

Findings

01 CRAN outperforms three baseline methods in both quantitative and qualitative evaluations.

02 The attention mechanism lets CRAN capture the snippets that are most significant for distinguishing between genres.

03 The method was evaluated on over 32,000 popular tracks in Korea.

Abstract

Music highlights are valuable content for music services. Most existing methods focus on low-level signal features. We propose a method for extracting highlights using high-level features from convolutional recurrent attention networks (CRAN). CRAN uses convolutional and recurrent layers for sequential learning, combined with an attention mechanism. The attention allows CRAN to capture the snippets that are most significant for distinguishing between genres, so the attention can be used as a high-level feature. CRAN was evaluated on over 32,000 popular tracks in Korea for two months. Experimental results show that our method outperforms three baseline methods in both quantitative and qualitative evaluations. We also analyze the effects of attention and sequence information on performance.
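To illustrate how attention over a track might be turned into a highlight, here is a minimal sketch in plain Python. It is not the authors' implementation. The abstract does not say how attention scores are converted into a highlight interval, so the fixed-length sliding window, the function names `softmax` and `pick_highlight`, and the example scores are all assumptions made for illustration.

```python
import math


def softmax(scores):
    """Normalize raw scores into non-negative weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def pick_highlight(weights, window):
    """Return (start, end) of the contiguous window with the largest total weight.

    Hypothetical selection rule: the highlight is assumed to be a fixed-length
    run of snippets, and `end` is exclusive.
    """
    if not 0 < window <= len(weights):
        raise ValueError("window must be between 1 and len(weights)")
    current = sum(weights[:window])
    best_sum, best_start = current, 0
    # Slide the window one snippet at a time, updating the running sum.
    for start in range(1, len(weights) - window + 1):
        current += weights[start + window - 1] - weights[start - 1]
        if current > best_sum:
            best_sum, best_start = current, start
    return best_start, best_start + window


# Made-up per-snippet attention scores for one track (illustrative only).
raw_scores = [0.1, 0.3, 2.0, 2.5, 1.8, 0.2, 0.1, 0.4]
weights = softmax(raw_scores)
start, end = pick_highlight(weights, window=3)
print(start, end)  # prints "2 5"
```

In a real system the scores would come from a trained network's attention layer instead of a hand-written list, and the window length would be chosen to match the desired highlight duration.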

Taxonomy

Topics: Music and Audio Processing · Video Analysis and Summarization · Speech and Audio Processing

Methods: Convolution