Enhancing Audio Augmentation Methods with Consistency Learning

Turab Iqbal; Karim Helwani; Arvindh Krishnaswamy; Wenwu Wang

arXiv:2102.05151 · cs.SD · April 20, 2021

TL;DR

This paper investigates whether adding explicit consistency constraints to the training objective, on top of standard data augmentation, makes audio classifiers more invariant to the augmenting transformations and thereby improves classification performance.

Contribution

The paper investigates training objectives that explicitly impose consistency constraints and shows empirically that they can improve the performance of deep convolutional neural networks on audio classification tasks.

Findings

01 Certain measures of consistency are not implicitly captured by the cross-entropy loss.

02 Incorporating such measures into the loss function can improve audio classification performance.

03 Explicitly enforcing consistency enhances the benefits of data augmentation.

Abstract

Data augmentation is an inexpensive way to increase training data diversity and is commonly achieved via transformations of existing data. For tasks such as classification, there is a good case for learning representations of the data that are invariant to such transformations, yet this is not explicitly enforced by classification losses such as the cross-entropy loss. This paper investigates the use of training objectives that explicitly impose this consistency constraint and how it can impact downstream audio classification tasks. In the context of deep convolutional neural networks in the supervised setting, we show empirically that certain measures of consistency are not implicitly captured by the cross-entropy loss and that incorporating such measures into the loss function can improve the performance of audio classification systems. Put another way, we demonstrate how existing…
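As a rough illustration of the idea described in the abstract, the sketch below adds a consistency term to the usual cross-entropy loss for one clean example and one augmented copy of it. The choice of Jensen-Shannon divergence as the consistency measure, the averaging of the two cross-entropy terms, and the names `consistency_loss` and `weight` are assumptions made for illustration; the abstract does not specify the paper's exact formulation.

```python
import math

def softmax(logits):
    """Convert raw scores to probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(probs, label):
    """Negative log-probability assigned to the true class."""
    return -math.log(probs[label])

def kl(p, q):
    """KL(p || q); terms with p_i == 0 contribute nothing."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def js_divergence(p, q):
    """Jensen-Shannon divergence: symmetric and bounded by ln 2."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

def consistency_loss(logits_clean, logits_aug, label, weight=1.0):
    """Cross-entropy on both views plus a weighted consistency penalty.

    ASSUMPTION: JS divergence between the two predicted distributions
    stands in for the consistency measure; `weight` is a hypothetical
    hyperparameter trading off the two terms.
    """
    p = softmax(logits_clean)   # prediction for the original clip
    q = softmax(logits_aug)     # prediction for the augmented clip
    ce = 0.5 * (cross_entropy(p, label) + cross_entropy(q, label))
    return ce + weight * js_divergence(p, q)
```

When the model produces identical predictions for both views, the penalty vanishes and the loss reduces to plain cross-entropy. Two predictions can each rank the correct class first and still disagree as distributions, which is the kind of inconsistency that cross-entropy alone does not penalize directly.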

Taxonomy

Topics: Music and Audio Processing · Speech and Audio Processing · Music Technology and Sound Studies