SemiPL: A Semi-supervised Method for Event Sound Source Localization

Yue Li; Baiqiao Yin; Jinfu Liu; Jiajun Wen; Jiaying Lin; Mengyuan Liu

arXiv:2404.19615 · cs.CV · May 1, 2024

TL;DR

SemiPL introduces a semi-supervised approach to enhance event sound source localization in complex, chaotic environments, improving performance over existing models by leveraging parameter tuning and semi-supervised learning techniques.

Contribution

The paper proposes SemiPL, a semi-supervised method that improves sound source localization accuracy on complex datasets and extends previous contrastive learning frameworks.

Findings

1. SemiPL achieves a 12.2% improvement in cIoU on the Chaotic World dataset.

2. Parameter tuning positively impacts model performance.

3. SemiPL outperforms existing models in complex event scenarios.
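
The findings above are reported in cIoU, which this page does not define. As a rough illustration only, the sketch below follows the consensus-IoU formulation commonly used in earlier sound source localization work: the predicted region is the set of heatmap pixels above a threshold, the numerator sums ground-truth weights inside that region, and the denominator adds the total ground-truth weight to the number of predicted pixels that fall outside the ground truth. The function names `ciou` and `success_rate`, the 0.5 thresholds, and the nested-list input format are assumptions made for this example, not details taken from the paper or its repository.

```python
from typing import Sequence

Grid = Sequence[Sequence[float]]


def ciou(heatmap: Grid, gt_weights: Grid, threshold: float = 0.5) -> float:
    """Consensus IoU between a thresholded heatmap and a ground-truth weight map.

    heatmap:    predicted localization scores, assumed to lie in [0, 1].
    gt_weights: per-pixel ground-truth weights (0 = background); with a
                binary mask this reduces to ordinary IoU.
    threshold:  assumed cut-off defining the predicted region.
    """
    inter = 0.0         # ground-truth weight covered by the prediction
    gt_total = 0.0      # total ground-truth weight
    pred_outside = 0.0  # predicted pixels that miss the ground truth
    for h_row, g_row in zip(heatmap, gt_weights):
        for h, g in zip(h_row, g_row):
            gt_total += g
            if h > threshold:
                if g > 0:
                    inter += g
                else:
                    pred_outside += 1.0
    denom = gt_total + pred_outside
    return inter / denom if denom > 0 else 0.0


def success_rate(scores: Sequence[float], min_ciou: float = 0.5) -> float:
    """Fraction of samples whose cIoU reaches min_ciou (assumed 0.5 here)."""
    if not scores:
        return 0.0
    return sum(1 for s in scores if s >= min_ciou) / len(scores)


if __name__ == "__main__":
    # Toy 2x2 example: the prediction covers one true pixel and one false one.
    heat = [[0.9, 0.8], [0.1, 0.2]]
    mask = [[1.0, 0.0], [1.0, 0.0]]
    print(ciou(heat, mask))
```

Under this reading, a dataset-level score is the share of test samples whose per-sample value clears the chosen cut-off, which is what `success_rate` computes. The paper's own evaluation code may differ in thresholds and normalization.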

Abstract

In recent years, Event Sound Source Localization has been widely applied in various fields. Recent works, typically relying on the contrastive learning framework, show impressive performance. However, all of this work is based on large, relatively simple datasets. In many applications, e.g., crowd management and emergency response services, it is also crucial to understand and analyze human behaviors (actions and interactions of people), voices, and sounds in chaotic events. In this paper, we apply the existing model to a more complex dataset, explore the influence of parameters on the model, and propose a semi-supervised improvement method, SemiPL. With the increase in data quantity and the influence of label quality, self-supervised learning will be an unstoppable trend. The experiments show that parameter adjustment positively affects the existing model. In particular, SSPL achieved an…

Code & Models

Repositories

ly245422/sspl (PyTorch, official)

Taxonomy

Topics: Speech Recognition and Synthesis · Music and Audio Processing · Speech and Audio Processing

Methods: Contrastive Learning