Deformable CNN and Imbalance-Aware Feature Learning for Singing Technique Classification

Yuya Yamamoto; Juhan Nam; Hiroko Terasawa

arXiv:2206.12230·cs.SD·June 27, 2022

Open Access

TL;DR

This paper introduces a deformable CNN combined with imbalance-aware feature learning to improve singing technique classification, addressing dataset imbalance and technique variability.

Contribution

It proposes an audio feature learning method that combines deformable convolution with decoupled training of the feature extractor and the classifier, using a class-weighted loss function.

Findings

1. Deformable convolution improves the classification results.

2. Applying deformable convolution to the last two convolutional layers yields the best results.

3. Re-training the classifier and weighting the cross-entropy loss by a smoothed inverse frequency both enhance performance.
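To make the deformable convolution findings more concrete, here is a minimal pure-Python sketch of the general idea: each kernel tap samples the input at its regular grid position plus a fractional offset, using bilinear interpolation. This is not the paper's implementation. The function names, the toy 4x4 input, and the hand-written offsets are all assumptions for illustration; in an actual deformable CNN the offsets are predicted by an additional convolutional layer and learned during training.

```python
import math


def bilinear_sample(image, y, x):
    """Sample a 2D list at fractional (y, x); positions outside the image read as 0."""
    h, w = len(image), len(image[0])
    y0, x0 = math.floor(y), math.floor(x)
    value = 0.0
    # Blend the four integer neighbours, weighted by their distance to (y, x).
    for yi in (y0, y0 + 1):
        for xi in (x0, x0 + 1):
            if 0 <= yi < h and 0 <= xi < w:
                weight = (1 - abs(y - yi)) * (1 - abs(x - xi))
                value += weight * image[yi][xi]
    return value


def deformable_conv2d_at(image, kernel, offsets, cy, cx):
    """Output of one deformable convolution at centre (cy, cx).

    offsets[i][j] is the (dy, dx) shift added to the regular grid
    position of kernel tap (i, j). Hand-set here (an assumption);
    a real layer would predict these from the input features.
    """
    kh, kw = len(kernel), len(kernel[0])
    out = 0.0
    for i in range(kh):
        for j in range(kw):
            dy, dx = offsets[i][j]
            y = cy + (i - kh // 2) + dy
            x = cx + (j - kw // 2) + dx
            out += kernel[i][j] * bilinear_sample(image, y, x)
    return out


# Toy input whose value grows by 1 per column and by 4 per row.
image = [[float(r * 4 + c) for c in range(4)] for r in range(4)]
kernel = [[1.0 / 9.0] * 3 for _ in range(3)]  # 3x3 mean filter

zero_offsets = [[(0.0, 0.0)] * 3 for _ in range(3)]
shifted_offsets = [[(0.0, 0.5)] * 3 for _ in range(3)]  # half a column to the right

regular = deformable_conv2d_at(image, kernel, zero_offsets, 1, 1)
shifted = deformable_conv2d_at(image, kernel, shifted_offsets, 1, 1)
print(regular, shifted)
```

With all offsets at zero the operation reduces to an ordinary convolution. Non-zero offsets let the sampling grid bend, which is the property that plausibly helps with the varied temporal fluctuations of singing techniques.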

Abstract

Singing techniques are used for expressive vocal performances by employing temporal fluctuations of the timbre, the pitch, and other components of the voice. Their classification is a challenging task, mainly because of two factors: 1) the fluctuations in singing techniques have a wide variety and are affected by many factors, and 2) existing datasets are imbalanced. To deal with these problems, we developed a novel audio feature learning method based on deformable convolution with decoupled training of the feature extractor and the classifier using a class-weighted loss function. The experimental results show the following: 1) the deformable convolution improves the classification results, particularly when it is applied to the last two convolutional layers, and 2) both re-training the classifier and weighting the cross-entropy loss function by a smoothed inverse frequency enhance the…
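The abstract does not give the exact formula for the smoothed inverse frequency weights, so the following is only a sketch under a stated assumption: each class weight is proportional to (1 / count) raised to a smoothing exponent `beta`, then rescaled so the weights average to 1. The exponent `beta`, the function names, and the toy labels `common` and `rare` are hypothetical and not taken from the paper.

```python
import math
from collections import Counter


def smoothed_inverse_frequency_weights(labels, beta=0.5):
    """Per-class weights proportional to (1 / count) ** beta, scaled to mean 1.

    beta is a hypothetical smoothing exponent (an assumption):
    beta=1 gives plain inverse frequency, beta=0 gives uniform weights.
    """
    counts = Counter(labels)
    raw = {c: (1.0 / n) ** beta for c, n in counts.items()}
    mean = sum(raw.values()) / len(raw)
    return {c: w / mean for c, w in raw.items()}


def weighted_cross_entropy(logits, target, weights):
    """Class-weighted cross-entropy for one example.

    logits maps each class name to its unnormalized score.
    """
    m = max(logits.values())
    # Log-sum-exp with the maximum subtracted for numerical stability.
    log_z = m + math.log(sum(math.exp(v - m) for v in logits.values()))
    return -weights[target] * (logits[target] - log_z)


# Toy imbalanced label set: 90 examples of one class, 10 of another.
labels = ["common"] * 90 + ["rare"] * 10
weights = smoothed_inverse_frequency_weights(labels, beta=0.5)

logits = {"common": 0.0, "rare": 0.0}
loss_common = weighted_cross_entropy(logits, "common", weights)
loss_rare = weighted_cross_entropy(logits, "rare", weights)
print(weights, loss_common, loss_rare)
```

Under this assumption, an error on the rare class costs more than the same error on the common class, but less than it would under unsmoothed inverse frequency. In a decoupled setup like the one the abstract describes, such a loss could be used when re-training the classifier on top of a fixed feature extractor.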

Taxonomy

Topics: Music and Audio Processing · Speech and Audio Processing · Speech Recognition and Synthesis