Enhancing Audio Augmentation Methods with Consistency Learning
Turab Iqbal, Karim Helwani, Arvindh Krishnaswamy, Wenwu Wang

TL;DR
This paper explores how explicitly enforcing consistency constraints in training objectives can enhance audio classification performance, building on data augmentation techniques to improve model invariance to transformations.
Contribution
The paper investigates training objectives that explicitly impose consistency constraints and shows empirically that they can improve the performance of deep convolutional neural networks on audio classification tasks.
Findings
Certain measures of consistency are not implicitly captured by the cross-entropy loss.
Incorporating such measures into the loss function can improve audio classification performance.
Explicit consistency enforcement enhances data augmentation benefits.
Abstract
Data augmentation is an inexpensive way to increase training data diversity and is commonly achieved via transformations of existing data. For tasks such as classification, there is a good case for learning representations of the data that are invariant to such transformations, yet this is not explicitly enforced by classification losses such as the cross-entropy loss. This paper investigates the use of training objectives that explicitly impose this consistency constraint and how it can impact downstream audio classification tasks. In the context of deep convolutional neural networks in the supervised setting, we show empirically that certain measures of consistency are not implicitly captured by the cross-entropy loss and that incorporating such measures into the loss function can improve the performance of audio classification systems. Put another way, we demonstrate how existing…
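To make the idea of an explicit consistency constraint concrete, the following is a minimal sketch of a loss that adds a consistency term to the cross-entropy loss. It is an illustration under stated assumptions, not the authors' implementation: the summary above does not say which consistency measure the paper uses, so the Jensen-Shannon divergence between the model's predictions on an original and an augmented input is an assumed choice, and the names `consistency_loss` and `weight` are hypothetical.

```python
import math

def softmax(logits):
    """Convert raw class scores to probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(probs, label):
    """Negative log-probability of the true class."""
    return -math.log(max(probs[label], 1e-12))

def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions."""
    return sum(pi * math.log(pi / max(qi, 1e-12))
               for pi, qi in zip(p, q) if pi > 0)

def js_divergence(p, q):
    """Jensen-Shannon divergence: symmetric, bounded above by ln 2."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)

def consistency_loss(logits_clean, logits_aug, label, weight=1.0):
    """Cross-entropy on both views plus a weighted consistency penalty.

    ASSUMPTION: JS divergence as the consistency measure and `weight`
    as the trade-off parameter are illustrative, not taken from the paper.
    """
    p_clean = softmax(logits_clean)
    p_aug = softmax(logits_aug)
    # Classification term: average cross-entropy over the two views.
    ce = 0.5 * (cross_entropy(p_clean, label) + cross_entropy(p_aug, label))
    # Consistency term: penalize disagreement between the two predictions.
    return ce + weight * js_divergence(p_clean, p_aug)

# Hypothetical model outputs for one clip and its augmented copy.
clean = [2.0, 0.5, -1.0]
augmented = [0.0, 2.0, -1.0]
print(consistency_loss(clean, augmented, label=0, weight=1.0))
```

With `weight=0.0` the function reduces to plain cross-entropy on the two views, which places no direct penalty on the two predictions disagreeing with each other. A positive `weight` adds that penalty explicitly, which is the kind of constraint the abstract describes.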
Taxonomy
Topics: Music and Audio Processing · Speech and Audio Processing · Music Technology and Sound Studies
