Weakly Supervised Scalable Audio Content Analysis
Anurag Kumar, Bhiksha Raj

TL;DR
This paper introduces a scalable weakly supervised learning framework for audio event detection that leverages web multimedia data with minimal annotation effort. It shows that multiple instance learning makes detection from weak labels feasible and that the proposed scalable algorithm is competitive with existing multiple instance learning methods.
Contribution
It proposes a novel scalable multiple instance learning algorithm for audio event detection using weak labels, reducing annotation costs and effort.
Findings
Weakly supervised learning is feasible for audio event detection.
The proposed algorithm is competitive with existing multiple instance learning methods.
The scalable approach can make use of large amounts of web multimedia data.
Abstract
Audio event detection is an important task for content analysis of multimedia data. Most current work on detecting audio events is driven by supervised learning approaches. We propose a weakly supervised learning framework which can make use of the tremendous amount of web multimedia data with significantly reduced annotation effort and expense. Specifically, we use several multiple instance learning algorithms to show that audio event detection through weak labels is feasible. We also propose a novel scalable multiple instance learning algorithm and show that it is competitive with other multiple instance learning algorithms for audio event detection tasks.
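To make the weak-label setting concrete, here is a minimal Python sketch of the standard multiple instance learning assumption. It is not the paper's algorithm. It assumes each recording is treated as a "bag" of short segments (instances), and that a recording-level label only says whether the event occurs somewhere in the recording. The function names `make_bag`, `bag_prediction`, and `locate_event`, and the parameters `segment_len`, `hop`, and `threshold`, are hypothetical and chosen only for illustration.

```python
import numpy as np

def make_bag(features, segment_len, hop):
    """Split a (frames, dims) feature matrix into overlapping segments.

    Each segment is summarized by its mean vector and becomes one instance
    of the bag. Mean pooling is an illustrative assumption.
    """
    last_start = max(len(features) - segment_len, 0)
    starts = range(0, last_start + 1, hop)
    return np.stack([features[s:s + segment_len].mean(axis=0) for s in starts])

def bag_prediction(instance_scores, threshold=0.0):
    """Standard MIL assumption: a bag is positive if any instance is positive."""
    return bool(np.max(instance_scores) > threshold)

def locate_event(instance_scores):
    """Index of the highest-scoring instance, a rough guess at where the event is."""
    return int(np.argmax(instance_scores))

# Toy recording: 10 frames of 2-dimensional features.
features = np.arange(20, dtype=float).reshape(10, 2)
bag = make_bag(features, segment_len=4, hop=2)

# Hypothetical per-instance classifier scores for the four segments.
scores = np.array([-1.0, 0.5, -2.0, -0.3])
is_positive = bag_prediction(scores)
best_segment = locate_event(scores)
```

Under this assumption only the recording-level label is needed for training, which is what keeps annotation effort low. The per-instance scores would come from whichever multiple instance learning algorithm is used.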
