Cross-domain Semi-Supervised Audio Event Classification Using Contrastive Regularization
Donmoon Lee, Kyogu Lee

TL;DR
This paper introduces a novel semi-supervised audio event classification method that employs contrastive regularization and batch audio mixing to improve performance, stability, and generalization across different data domains.
Contribution
The study presents a new contrastive regularization technique combined with a simple audio mixing augmentation for semi-supervised learning in cross-domain audio classification.
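As a rough illustration of what a batch-level audio mixing augmentation can look like, the sketch below blends each waveform with a partner drawn from the same batch. The summary does not give the paper's exact mixing rule, so the function name `mix_batch`, the `max_gain` parameter, and the uniform sampling of mixing weights are all assumptions made for illustration, not the authors' implementation.

```python
import numpy as np


def mix_batch(waveforms, rng, max_gain=0.5):
    """Blend each waveform with another sample from the same batch.

    waveforms: float array of shape (batch, samples).
    max_gain:  hypothetical upper bound on the partner's mixing weight.
    """
    batch_size = waveforms.shape[0]
    # Choose a partner for every item by permuting the batch indices.
    # A permutation may map an item to itself, which leaves it unchanged.
    partner = rng.permutation(batch_size)
    # One mixing weight per item, broadcast over the time axis.
    gain = rng.uniform(0.0, max_gain, size=(batch_size, 1))
    # Weighted sum of the original waveform and its partner.
    return (1.0 - gain) * waveforms + gain * waveforms[partner]


rng = np.random.default_rng(0)
batch = rng.standard_normal((8, 16000))  # 8 synthetic one-second clips at 16 kHz
mixed = mix_batch(batch, rng)
```

Because the partners come from the batch itself, this kind of augmentation needs no extra data loading; whether the paper mixes in the waveform or feature domain is not stated in this summary.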
Findings
Improved classification accuracy across diverse domains.
Enhanced training stability and generalization.
Effective use of unlabeled data with different class distributions.
Abstract
In this study, we propose a novel semi-supervised training method that uses unlabeled data whose class distribution is completely different from that of the target data, or data without a target label. To this end, we introduce a contrastive regularization that is designed to be target-task-oriented and is trained simultaneously with the target task. In addition, we propose a simple augmentation strategy based on audio mixing that is performed on batch samples. Experimental results show that the proposed method improves performance and, in particular, offers advantages in training stability and generalization.
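To make the idea of a contrastive term "trained simultaneously" with the classifier more concrete, the sketch below adds a contrastive loss on unlabeled embeddings to a supervised cross-entropy loss. The abstract does not specify the form of the contrastive regularization, so a generic NT-Xent-style loss is used here as a stand-in; the function names, the `temperature` value, and the `weight` parameter are assumptions for illustration only.

```python
import numpy as np


def cross_entropy(logits, labels):
    """Mean cross-entropy over labeled examples."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_prob = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return -log_prob[np.arange(len(labels)), labels].mean()


def nt_xent(z_a, z_b, temperature=0.1):
    """NT-Xent-style loss: row i of z_a and row i of z_b form a positive pair."""
    n = z_a.shape[0]
    z = np.concatenate([z_a, z_b], axis=0)
    z = z / np.linalg.norm(z, axis=1, keepdims=True)  # cosine similarity
    sim = z @ z.T / temperature
    np.fill_diagonal(sim, -np.inf)  # an embedding is never its own candidate
    # Index of the positive partner for each of the 2n rows.
    pos = np.concatenate([np.arange(n, 2 * n), np.arange(0, n)])
    shifted = sim - sim.max(axis=1, keepdims=True)
    log_prob = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return -log_prob[np.arange(2 * n), pos].mean()


def total_loss(logits, labels, z_a, z_b, weight=1.0):
    """Supervised loss plus a weighted contrastive regularizer (hypothetical weighting)."""
    return cross_entropy(logits, labels) + weight * nt_xent(z_a, z_b)


rng = np.random.default_rng(0)
logits = rng.standard_normal((4, 3))    # labeled batch: 4 clips, 3 classes
labels = np.array([0, 1, 2, 0])
z_a = rng.standard_normal((16, 32))     # embeddings of one view of unlabeled clips
z_b = z_a + 0.05 * rng.standard_normal((16, 32))  # embeddings of a second view
loss = total_loss(logits, labels, z_a, z_b)
```

In a real training loop the two views would come from augmented versions of the same unlabeled clip, and both terms would be backpropagated through a shared encoder in the same step.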
Taxonomy
Topics: Music and Audio Processing · Speech and Audio Processing · Flow Measurement and Analysis
