Exploiting Parallel Audio Recordings to Enforce Device Invariance in CNN-based Acoustic Scene Classification

Paul Primus; Hamid Eghbal-zadeh; David Eitelsebner; Khaled Koutini; Andreas Arzt; Gerhard Widmer

arXiv:1909.02869·eess.AS·September 9, 2019

TL;DR

This paper introduces a domain adaptation method for acoustic scene classification that uses parallel (time-aligned) recordings from different devices to learn device-invariant features. The adaptation step needs no classification labels and targets robustness to mismatches between recording devices.

Contribution

The paper proposes an end-to-end approach for training domain-invariant classifiers from parallel audio recordings, so that domain adaptation requires no classification labels.

Findings

01. Learns device-invariant features from parallel recordings.

02. Reduces the impact of distribution mismatch between recording devices.

03. Requires no classification labels for the adaptation step.

Abstract

Distribution mismatches between the data seen at training and at application time remain a major challenge in all application areas of machine learning. We study this problem in the context of machine listening (Task 1b of the DCASE 2019 Challenge). We propose a novel approach to learn domain-invariant classifiers in an end-to-end fashion by enforcing equal hidden layer representations for domain-parallel samples, i.e. time-aligned recordings from different recording devices. No classification labels are needed for our domain adaptation (DA) method, which makes the data collection process cheaper.
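
The idea of "enforcing equal hidden layer representations for domain-parallel samples" can be illustrated with a small sketch. It is not the authors' implementation: the choice of mean squared error as the penalty, the additive combination with a cross-entropy term, and the names `pairwise_mse`, `cross_entropy`, `combined_loss`, and `weight` are all assumptions made for illustration. The sketch only shows that the invariance term compares hidden activations of time-aligned recordings and never looks at a class label.

```python
import math


def pairwise_mse(hidden_a, hidden_b):
    """Mean squared difference between hidden activations of parallel samples.

    hidden_a[i] and hidden_b[i] are the activations for the same time-aligned
    recording captured by two different devices. No labels are used here.
    MSE is an assumed penalty, chosen only for illustration.
    """
    assert len(hidden_a) == len(hidden_b), "parallel batches must be paired"
    total, count = 0.0, 0
    for vec_a, vec_b in zip(hidden_a, hidden_b):
        for x, y in zip(vec_a, vec_b):
            total += (x - y) ** 2
            count += 1
    return total / count


def cross_entropy(logits, label):
    """Numerically stable softmax cross-entropy for a single labeled sample."""
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[label]


def combined_loss(logits_batch, labels, hidden_a, hidden_b, weight=1.0):
    """Classification loss on labeled samples plus the invariance penalty.

    `weight` is a hypothetical trade-off parameter, not a value from the paper.
    """
    ce = sum(cross_entropy(l, y) for l, y in zip(logits_batch, labels)) / len(labels)
    return ce + weight * pairwise_mse(hidden_a, hidden_b)
```

In this sketch the labeled batch and the parallel batch are separate inputs, which mirrors the abstract's point that the parallel recordings themselves need no classification labels.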

Code & Models

Repositories

OptimusPrimus/dcase2019_task1b
PyTorch · Official
