Audio Event and Scene Recognition: A Unified Approach using Strongly and Weakly Labeled Data
Anurag Kumar, Bhiksha Raj

TL;DR
This paper introduces a unified learning framework that combines strongly and weakly labeled data for audio event and scene recognition, improving detection accuracy by leveraging limited fully labeled data alongside weak labels.
Contribution
The paper presents a novel Supervised and Weakly Supervised Learning (SWSL) framework and a graph-based manifold regularization method that integrates weakly and strongly labeled data for audio event and scene recognition.
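As a rough illustration of what a graph-based manifold regularizer generally looks like, the sketch below builds a k-nearest-neighbor graph over feature vectors and computes the standard Laplacian smoothness penalty f^T L f, which is small when neighboring points receive similar scores. This is a generic textbook construction, not necessarily the formulation used in the paper; the function names, the binary edge weights, and the choice of k are assumptions made for the example.

```python
import numpy as np

def knn_graph_laplacian(X, k=2):
    """Unnormalized Laplacian L = D - W of a symmetrized k-NN graph (illustrative)."""
    n = X.shape[0]
    # Pairwise squared Euclidean distances between all rows of X.
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
    # Exclude self-matches so a point is never its own neighbor.
    np.fill_diagonal(d2, np.inf)
    W = np.zeros((n, n))
    for i in range(n):
        neighbors = np.argsort(d2[i])[:k]
        W[i, neighbors] = 1.0  # assumption: binary edge weights
    W = np.maximum(W, W.T)  # make the graph undirected
    return np.diag(W.sum(axis=1)) - W

def smoothness_penalty(f, L):
    """Return f^T L f, i.e. the sum over undirected edges of (f_i - f_j)^2."""
    return float(f @ L @ f)
```

In a regularized objective, a penalty of this kind is typically added to the supervised loss with a weighting coefficient, so that predictions vary slowly along the data graph.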
Findings
Combining strongly and weakly labeled data improves audio event detection accuracy.
Effective graph-based approach for semi-supervised learning.
Framework addresses challenges of weakly labeled audio data.
Abstract
In this paper, we propose a novel learning framework called Supervised and Weakly Supervised Learning, where the goal is to learn simultaneously from weakly and strongly labeled data. Strongly labeled data can be simply understood as fully supervised data where all labeled instances are available. In weakly supervised learning, the data is only weakly labeled, which prevents one from directly applying supervised learning methods. Our proposed framework is motivated by the fact that a small amount of strongly labeled data can give considerable improvement over only weakly supervised learning. The primary problem domain of this paper is acoustic event and scene detection in audio recordings. We first propose a naive formulation for leveraging labeled data in both forms. We then propose a more general framework for Supervised and Weakly Supervised Learning (SWSL). Based on this general…
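The distinction between the two label types can be made concrete with a small data-structure sketch. It assumes the common multiple-instance view in which a recording is a bag of segments: a strongly labeled recording carries a label for every segment, while a weakly labeled one carries only a recording-level label. The `Recording` class and `bag_label_from_segments` helper are hypothetical names introduced for illustration and are not taken from the paper.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class Recording:
    """A recording viewed as a bag of segments (hypothetical structure)."""
    name: str
    bag_label: int  # 1 if the event occurs somewhere in the recording, else 0
    # Per-segment labels; None when the recording is only weakly labeled.
    segment_labels: Optional[List[int]] = None

    @property
    def is_strongly_labeled(self) -> bool:
        return self.segment_labels is not None

def bag_label_from_segments(segment_labels: List[int]) -> int:
    # Standard multiple-instance assumption: a bag is positive
    # if and only if at least one of its segments is positive.
    return int(any(segment_labels))

# A strongly labeled recording: the event is located in segments 1 and 2.
strong = Recording("clip_a", bag_label=1, segment_labels=[0, 1, 1, 0])
# A weakly labeled recording: the event is present, but its location is unknown.
weak = Recording("clip_b", bag_label=1)
```

A learner in this setting can use segment-level supervision where it exists and fall back to the bag-level constraint elsewhere.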
Taxonomy
Topics: Music and Audio Processing · Speech and Audio Processing · Video Analysis and Summarization
