Ensemble of Discriminators for Domain Adaptation in Multiple Sound Source 2D Localization

Guillaume Le Moing; Don Joven Agravante; Tadanobu Inoue; Jayakorn Vongkulbhisal; Asim Munawar; Ryuki Tachibana; Phongtharin Vinayavekhin

arXiv:2012.05908 · eess.AS · March 17, 2021 · 1 citation

TL;DR

This paper presents an ensemble of discriminators within an adversarial domain adaptation framework to enhance multi-source sound localization accuracy, effectively bridging the gap between synthetic training data and real-world recordings.

Contribution

It introduces an ensemble of discriminators operating at multiple feature levels for domain adaptation in sound source localization, improving performance without requiring labels for real recordings.

Findings

01. The ensemble of discriminators improves localization accuracy.

02. The method reduces the effects of domain mismatch.

03. The improvement requires no labels for real data.

Abstract

This paper introduces an ensemble of discriminators that improves the accuracy of a domain adaptation technique for the localization of multiple sound sources. Recently, deep neural networks have led to promising results for this task, yet they require a large amount of labeled data for training. Recording and labeling such datasets is very costly, especially because the data needs to be diverse enough to cover different acoustic conditions. In this paper, we leverage acoustic simulators to inexpensively generate labeled training samples. However, models trained on synthetic data tend to perform poorly on real-world recordings due to the domain mismatch. To address this, we explore two domain adaptation methods using adversarial learning for sound source localization, both of which use labeled synthetic data and unlabeled real data. We propose a novel ensemble approach that combines discriminators…
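To make the adversarial setup concrete, here is a minimal sketch in plain Python of how losses from several discriminators, each attached to a different feature level, might be combined. It is not the authors' implementation: the abstract is truncated before it says how the discriminators are combined, so the uniform average, the function names, and the `adv_weight` parameter are all assumptions made for illustration. Each discriminator outputs the probability that a sample comes from a real recording; synthetic samples carry domain label 0 and real samples label 1. No localization labels for real data are used anywhere.

```python
import math

def bce(prob, label):
    """Binary cross-entropy for one prediction; prob is P(sample is real)."""
    eps = 1e-12
    prob = min(max(prob, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -(label * math.log(prob) + (1 - label) * math.log(1.0 - prob))

def discriminator_loss(synth_probs, real_probs):
    """Mean domain-classification loss of a single discriminator.

    Synthetic samples have domain label 0, real samples have label 1.
    """
    losses = [bce(p, 0) for p in synth_probs] + [bce(p, 1) for p in real_probs]
    return sum(losses) / len(losses)

def ensemble_discriminator_loss(per_level_outputs):
    """Combine discriminators attached to different feature levels.

    per_level_outputs: list of (synth_probs, real_probs), one pair per level.
    ASSUMPTION: a uniform average; the source does not state the rule.
    """
    losses = [discriminator_loss(s, r) for s, r in per_level_outputs]
    return sum(losses) / len(losses)

def total_localizer_loss(task_loss, per_level_outputs, adv_weight=0.1):
    """Objective for the localization network (hypothetical weighting).

    task_loss is the supervised loss on labeled synthetic data. The
    adversarial term is subtracted because the localizer is trained to
    make the two domains indistinguishable to the discriminators.
    """
    return task_loss - adv_weight * ensemble_discriminator_loss(per_level_outputs)

# Two feature levels, each discriminator at chance (every output is 0.5).
chance = [([0.5, 0.5], [0.5]), ([0.5], [0.5, 0.5])]
print(ensemble_discriminator_loss(chance))
print(total_localizer_loss(1.0, chance))
```

When every discriminator sits at chance, the ensemble loss equals ln 2, which is the point where the features carry no usable information about whether a sample is synthetic or real.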

Taxonomy

Topics: Speech and Audio Processing · Music and Audio Processing · Speech Recognition and Synthesis