Capsule Routing for Sound Event Detection
Turab Iqbal, Yong Xu, Qiuqiang Kong, Wenwu Wang

TL;DR
This paper introduces a capsule routing neural network for sound event detection, achieving state-of-the-art classification performance and reducing overfitting by capturing global coherence in audio signals.
Contribution
It applies capsule routing to sound event detection, demonstrating improved generalization and performance over existing architectures.
Findings
Achieved an F-score of 58.6% on DCASE 2017 Task 4
Reduced overfitting compared to other models
Improved global coherence learning in audio classification
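For readers unfamiliar with the metric, the F-score is the harmonic mean of precision and recall. The sketch below shows the arithmetic only. The function name `f_score` and the counts in the usage line are hypothetical, and the averaging scheme used in the challenge evaluation is not stated in this summary, so none is assumed here.

```python
def f_score(tp: int, fp: int, fn: int) -> float:
    """F-score (F1) from true-positive, false-positive and false-negative counts."""
    # Precision: fraction of predicted events that are correct.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # Recall: fraction of reference events that were found.
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0.0:
        return 0.0
    # Harmonic mean of precision and recall.
    return 2.0 * precision * recall / (precision + recall)


# Hypothetical counts for illustration only; these are not results from the paper.
example = f_score(tp=8, fp=2, fn=2)
```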
Abstract
The detection of acoustic scenes is a challenging problem in which environmental sound events must be detected from a given audio signal. This includes classifying the events as well as estimating their onset and offset times. We approach this problem with a neural network architecture that uses the recently-proposed capsule routing mechanism. A capsule is a group of activation units representing a set of properties for an entity of interest, and the purpose of routing is to identify part-whole relationships between capsules. That is, a capsule in one layer is assumed to belong to a capsule in the layer above in terms of the entity being represented. Using capsule routing, we wish to train a network that can learn global coherence implicitly, thereby improving generalization performance. Our proposed method is evaluated on Task 4 of the DCASE 2017 challenge. Results show that…
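The routing idea described in the abstract can be sketched in a few lines of NumPy. This is a minimal illustration that assumes the routing-by-agreement procedure commonly associated with capsule networks (softmax coupling coefficients, a "squash" nonlinearity, dot-product agreement). The summary does not specify the exact variant, layer sizes, or iteration count the authors use, so the names `squash`, `dynamic_routing`, `u_hat`, and `n_iters` and all shapes below are assumptions.

```python
import numpy as np


def squash(s, axis=-1, eps=1e-8):
    """Scale each vector so its length lies in [0, 1) while keeping its direction."""
    sq_norm = np.sum(s ** 2, axis=axis, keepdims=True)
    return (sq_norm / (1.0 + sq_norm)) * s / np.sqrt(sq_norm + eps)


def softmax(x, axis):
    """Numerically stable softmax along the given axis."""
    e = np.exp(x - np.max(x, axis=axis, keepdims=True))
    return e / np.sum(e, axis=axis, keepdims=True)


def dynamic_routing(u_hat, n_iters=3):
    """Route lower-layer capsules to upper-layer capsules by agreement.

    u_hat: array of shape (n_lower, n_upper, dim) holding, for every lower
           capsule, its predicted output vector for every upper capsule.
    Returns the upper capsule outputs (n_upper, dim) and the coupling
    coefficients (n_lower, n_upper) from the final iteration.
    """
    n_lower, n_upper, _ = u_hat.shape
    b = np.zeros((n_lower, n_upper))  # routing logits, start uniform
    for _ in range(n_iters):
        # Each lower capsule distributes its output over the upper capsules.
        c = softmax(b, axis=1)
        # Weighted sum of predictions for each upper capsule.
        s = np.sum(c[:, :, None] * u_hat, axis=0)
        v = squash(s)
        # Increase the logit where a prediction agrees with the upper output.
        b = b + np.sum(u_hat * v[None, :, :], axis=-1)
    return v, c


# Illustrative shapes only: 6 lower capsules, 3 upper capsules, 4-dim vectors.
rng = np.random.default_rng(0)
u_hat = rng.normal(size=(6, 3, 4))
v, c = dynamic_routing(u_hat)
```

Each row of `c` sums to one, which expresses the part-whole assumption: a lower-layer capsule's output is shared out among the candidate capsules in the layer above, with more weight going to those whose outputs it predicts well.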
Taxonomy
Topics: Music and Audio Processing · Speech and Audio Processing · Music Technology and Sound Studies
