Cross-domain Semi-Supervised Audio Event Classification Using Contrastive Regularization

Donmoon Lee; Kyogu Lee

arXiv:2109.14508 · cs.SD · September 30, 2021

Open Access

TL;DR

This paper introduces a novel semi-supervised audio event classification method that employs contrastive regularization and batch audio mixing to improve performance, stability, and generalization across different data domains.

Contribution

The study presents a new contrastive regularization technique combined with a simple audio mixing augmentation for semi-supervised learning in cross-domain audio classification.

Findings

1. Improved classification accuracy across diverse domains.
2. Enhanced training stability and generalization.
3. Effective use of unlabeled data with different class distributions.

Abstract

In this study, we propose a novel semi-supervised training method that uses unlabeled data whose class distribution is completely different from that of the target data, or data without target labels. To this end, we introduce a contrastive regularization that is designed to be target-task-oriented and is trained simultaneously with the target task. In addition, we propose a simple augmentation strategy based on audio mixing that is performed on the samples within a batch. Experimental results validate that the proposed method contributes to improved performance and, in particular, show that it has advantages in training stability and generalization.
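The abstract does not give the mixing rule or the exact form of the contrastive term, so the following is only a minimal sketch under stated assumptions. It assumes each waveform is blended with another waveform from the same batch using a random gain, and it substitutes the widely used NT-Xent loss for the paper's contrastive regularization. The function names (`mix_within_batch`, `nt_xent`, `total_loss`) and the parameters `max_gain`, `temperature`, and `weight` are hypothetical and do not come from the paper.

```python
import numpy as np


def mix_within_batch(batch, rng, max_gain=0.5):
    """Blend each waveform with another one drawn from the same batch.

    batch: array of shape (B, T). The gain scheme is an assumption,
    not the paper's specification.
    """
    batch = np.asarray(batch, dtype=np.float64)
    perm = rng.permutation(batch.shape[0])  # partner index for each sample
    gain = rng.uniform(0.0, max_gain, size=(batch.shape[0], 1))
    return (1.0 - gain) * batch + gain * batch[perm]


def nt_xent(z_a, z_b, temperature=0.1):
    """NT-Xent contrastive loss between two views of the same batch.

    z_a, z_b: embeddings of shape (B, D); row i of z_a and row i of z_b
    form a positive pair. Used here as a stand-in contrastive term.
    """
    n = z_a.shape[0]
    z = np.concatenate([z_a, z_b], axis=0).astype(np.float64)
    z = z / np.linalg.norm(z, axis=1, keepdims=True)  # cosine similarity
    sim = z @ z.T / temperature
    np.fill_diagonal(sim, -np.inf)  # a sample is never its own candidate
    pos = np.concatenate([np.arange(n, 2 * n), np.arange(0, n)])
    # Numerically stable log-softmax over each row.
    row_max = sim.max(axis=1, keepdims=True)
    shifted = sim - row_max
    log_prob = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return -log_prob[np.arange(2 * n), pos].mean()


def total_loss(supervised_loss, contrastive_loss, weight=1.0):
    """Joint objective: target-task loss plus a weighted regularizer."""
    return supervised_loss + weight * contrastive_loss


rng = np.random.default_rng(0)
waves = rng.standard_normal((4, 16))       # toy batch of 4 waveforms
mixed = mix_within_batch(waves, rng)       # augmented view of the batch

aligned = nt_xent(np.eye(4), np.eye(4))                      # matching pairs
shuffled = nt_xent(np.eye(4), np.roll(np.eye(4), 1, axis=0))  # wrong pairs
```

In this toy setup the aligned pairs give a much smaller loss than the shuffled pairs, which is the behavior a contrastive regularizer relies on. In a joint training loop, `total_loss` would be minimized so that the regularizer is optimized at the same time as the classifier rather than in a separate pre-training stage.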

Taxonomy

Topics: Music and Audio Processing · Speech and Audio Processing · Flow Measurement and Analysis