A Streamlined Encoder/Decoder Architecture for Melody Extraction

Tsung-Han Hsieh; Li Su; Yi-Hsuan Yang

arXiv:1810.12947 · eess.AS · February 19, 2019 · 5 cites

TL;DR

This paper introduces a simplified encoder/decoder neural network for melody extraction. It achieves results close to the state of the art with fewer convolutional layers by passing pooling indices from the pooling layers to the un-pooling layers, and it estimates melody existence from the bottleneck layer.

Contribution

The paper presents a streamlined architecture for melody extraction that reduces complexity while maintaining high accuracy, including a new method for melody existence detection.

Findings

01 Achieves near state-of-the-art performance with fewer convolutional layers

02 Passes pooling indices between pooling and un-pooling layers to localize the melody in frequency

03 Estimates melody existence per time frame from the bottleneck layer, so a simple argmax can replace ad-hoc thresholding
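
The pooling-index idea in finding 02 can be illustrated with a minimal sketch. This is plain Python on a single frame of frequency-bin activations, not the authors' PyTorch implementation; the function names `max_pool_with_indices` and `max_unpool`, the window size, and the input values are assumptions made for illustration only.

```python
def max_pool_with_indices(column, size=2):
    """Max-pool a 1-D list of activations and record where each maximum came from."""
    pooled, indices = [], []
    for start in range(0, len(column) - size + 1, size):
        window = column[start:start + size]
        # Position of the largest value inside this window (first one on ties).
        offset = max(range(size), key=lambda k: window[k])
        pooled.append(window[offset])
        indices.append(start + offset)
    return pooled, indices


def max_unpool(pooled, indices, length):
    """Place each pooled value back at its recorded position; all other bins stay zero."""
    restored = [0.0] * length
    for value, idx in zip(pooled, indices):
        restored[idx] = value
    return restored


# Hypothetical activations over 8 frequency bins for one time frame.
frame = [0.1, 0.9, 0.3, 0.2, 0.0, 0.4, 0.8, 0.7]
pooled, indices = max_pool_with_indices(frame)
restored = max_unpool(pooled, indices, len(frame))
print(pooled)    # [0.9, 0.3, 0.4, 0.8]
print(indices)   # [1, 2, 5, 6]
print(restored)  # [0.0, 0.9, 0.3, 0.0, 0.0, 0.4, 0.8, 0.0]
```

Because the un-pooling step reuses the recorded indices, each value returns to the exact frequency bin it came from rather than to a fixed or interpolated position, which is what preserves localization in frequency.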

Abstract

Melody extraction in polyphonic musical audio is important for music signal processing. In this paper, we propose a novel streamlined encoder/decoder network that is designed for the task. We make two technical contributions. First, drawing inspiration from a state-of-the-art model for semantic pixel-wise segmentation, we pass the pooling indices between pooling and un-pooling layers to localize the melody in frequency. We achieve results close to the state of the art with far fewer convolutional layers and simpler convolution modules. Second, we propose a way to use the bottleneck layer of the network to estimate the existence of a melody line for each time frame, which makes it possible to use a simple argmax function instead of ad-hoc thresholding to get the final estimate of the melody line. Our experiments on both vocal melody extraction and general melody extraction…
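
The second contribution can be sketched in a few lines. The sketch assumes that the per-frame existence estimate is treated as one extra "no melody" score placed alongside the frequency-bin scores, so that a single argmax decides both whether a melody is present and which bin it falls in. That arrangement, the function name `decode_melody`, the bin frequencies, and all scores below are illustrative assumptions, not details confirmed by the abstract.

```python
def decode_melody(salience, no_melody, bin_freqs):
    """Pick one frequency per frame with a plain argmax, or 0.0 when 'no melody' wins.

    salience:  list of frames, each a list of per-bin scores
    no_melody: list of per-frame 'no melody' scores (assumed to come from the bottleneck)
    bin_freqs: center frequency in Hz of each bin
    """
    melody = []
    for frame_scores, nm_score in zip(salience, no_melody):
        # Index 0 is the assumed 'no melody' class; indices 1..N are frequency bins.
        scores = [nm_score] + list(frame_scores)
        best = max(range(len(scores)), key=scores.__getitem__)
        melody.append(0.0 if best == 0 else bin_freqs[best - 1])
    return melody


# Hypothetical values: three frames, three frequency bins.
bin_freqs = [220.0, 440.0, 880.0]
salience = [[0.1, 0.7, 0.2], [0.2, 0.1, 0.1], [0.05, 0.1, 0.6]]
no_melody = [0.2, 0.9, 0.3]
print(decode_melody(salience, no_melody, bin_freqs))  # [440.0, 0.0, 880.0]
```

No threshold appears anywhere in the decoding: the middle frame is marked as unvoiced only because its "no melody" score is the largest entry in that frame.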

Code & Models

Repositories

bill317996/Melody-extraction-with-melodic-segnet
PyTorch · Official

Taxonomy

Topics: Music and Audio Processing · Speech and Audio Processing · Music Technology and Sound Studies