Ensemble of Discriminators for Domain Adaptation in Multiple Sound Source 2D Localization
Guillaume Le Moing, Don Joven Agravante, Tadanobu Inoue, Jayakorn Vongkulbhisal, Asim Munawar, Ryuki Tachibana, Phongtharin Vinayavekhin

TL;DR
This paper presents an ensemble of discriminators within an adversarial domain adaptation framework to enhance multi-source sound localization accuracy, effectively bridging the gap between synthetic training data and real-world recordings.
Contribution
The paper introduces a novel ensemble of discriminators, applied at multiple feature levels, for domain adaptation in sound source localization. It improves performance without requiring labels for real data.
Findings
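The ensemble of discriminators improves the localization accuracy of adversarial domain adaptation.
The method reduces the effect of the domain mismatch between synthetic training data and real-world recordings.
The improvement requires no labels for real data; only the synthetic data is labeled.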
Abstract
This paper introduces an ensemble of discriminators that improves the accuracy of a domain adaptation technique for the localization of multiple sound sources. Recently, deep neural networks have led to promising results for this task, yet they require a large amount of labeled data for training. Recording and labeling such datasets is very costly, especially because the data needs to be diverse enough to cover different acoustic conditions. In this paper, we leverage acoustic simulators to inexpensively generate labeled training samples. However, models trained on synthetic data tend to perform poorly on real-world recordings due to the domain mismatch. To address this, we explore two domain adaptation methods using adversarial learning for sound source localization that use labeled synthetic data and unlabeled real data. We propose a novel ensemble approach that combines discriminators…
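As a rough illustration of the idea of several domain discriminators attached at different feature levels, the sketch below computes a combined domain-classification loss. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names (`bce`, `discriminator_loss`, `ensemble_loss`), the label convention (synthetic = 1, real = 0), and the equal-weight averaging across discriminators are all hypothetical choices made for this example, since the summary does not specify how the discriminators are combined.

```python
import math


def bce(p, label, eps=1e-12):
    """Binary cross-entropy for one predicted probability p and a 0/1 domain label."""
    p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -(label * math.log(p) + (1 - label) * math.log(1.0 - p))


def discriminator_loss(probs_synth, probs_real):
    """Mean BCE for one discriminator.

    Assumed convention: synthetic samples carry domain label 1, real samples 0.
    No localization labels are needed for the real samples.
    """
    losses = [bce(p, 1) for p in probs_synth] + [bce(p, 0) for p in probs_real]
    return sum(losses) / len(losses)


def ensemble_loss(per_level_probs):
    """Average the domain losses of discriminators attached at several feature levels.

    per_level_probs: list of (probs_synth, probs_real) pairs, one per discriminator.
    Equal weighting is an assumption of this sketch.
    """
    level_losses = [discriminator_loss(s, r) for s, r in per_level_probs]
    return sum(level_losses) / len(level_losses)


if __name__ == "__main__":
    # Two hypothetical discriminators at two feature levels.
    confident = ensemble_loss([([0.9, 0.8], [0.1, 0.2]), ([0.7], [0.3])])
    confused = ensemble_loss([([0.5, 0.5], [0.5, 0.5]), ([0.5], [0.5])])
    print(confident, confused)
```

In adversarial training, the discriminators would minimize this loss while the feature extractor is trained to increase it, pushing the loss toward the value obtained when every discriminator outputs 0.5, that is, when synthetic and real features are indistinguishable at each level.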
Taxonomy
Topics: Speech and Audio Processing · Music and Audio Processing · Speech Recognition and Synthesis
