Surrey-CVSSP system for DCASE2017 challenge task4
Yong Xu, Qiuqiang Kong, Wenwu Wang, Mark D. Plumbley

TL;DR
This paper presents a system for the DCASE2017 challenge task 4, utilizing CNN and GRU neural networks with novel gating and attention mechanisms, achieving significant improvements over baseline methods in sound event detection and audio tagging.
Contribution
The paper introduces a learnable gating activation function, an attention-based localization scheme, and a batch-level balancing strategy for weakly labeled sound event detection.
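As an illustration of what a learnable gating activation can look like, here is a minimal sketch. It assumes the gate takes the gated-linear-unit form, a linear path multiplied elementwise by a sigmoid gate, which this summary does not confirm. The scalar parameters `w_lin`, `b_lin`, `w_gate`, and `b_gate` are hypothetical stand-ins for what would be learned convolution kernels in a real CNN.

```python
import math


def sigmoid(x):
    """Logistic function, mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def gated_activation(features, w_lin, b_lin, w_gate, b_gate):
    """Apply an elementwise gate to each feature value.

    Assumed form (not taken from the paper):
        out = (w_lin * x + b_lin) * sigmoid(w_gate * x + b_gate)
    The sigmoid branch decides how much of the linear branch passes through.
    """
    out = []
    for x in features:
        linear = w_lin * x + b_lin          # information path
        gate = sigmoid(w_gate * x + b_gate)  # learnable gate in (0, 1)
        out.append(linear * gate)
    return out


# Hypothetical parameters, chosen only to show the gating behaviour.
gated = gated_activation([-2.0, 0.0, 3.0], w_lin=1.0, b_lin=0.0,
                         w_gate=2.0, b_gate=0.0)
```

With these illustrative parameters the negative input is almost entirely suppressed while the large positive input passes nearly unchanged, which is the "selecting informative local features" behaviour in miniature.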
Findings
Achieved 61% F-value in audio tagging
Achieved 0.73 error rate in sound event detection
Outperformed baseline MLP system significantly
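For readers unfamiliar with the two metrics, the sketch below shows how an F-value and an error rate are typically computed. It assumes the segment-based error rate definition commonly used in DCASE evaluations (substitutions, deletions, and insertions divided by the number of reference events); the summary above does not state which variant was used. All counts are hypothetical and are not the paper's results.

```python
def f_value(tp, fp, fn):
    """F-value: harmonic mean of precision and recall."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def error_rate(segments):
    """Segment-based error rate (assumed definition).

    `segments` is an iterable of (n_ref, fn, fp) tuples, one per segment:
    number of reference events, false negatives, and false positives.
    """
    subs = dels = ins = total_ref = 0
    for n_ref, fn, fp in segments:
        subs += min(fn, fp)        # a miss paired with a false alarm
        dels += max(0, fn - fp)    # remaining misses
        ins += max(0, fp - fn)     # remaining false alarms
        total_ref += n_ref
    return (subs + dels + ins) / total_ref if total_ref else 0.0


# Hypothetical counts for illustration only.
example_f = f_value(tp=6, fp=2, fn=4)
example_er = error_rate([(2, 1, 1), (1, 1, 0), (1, 0, 2)])
```

Note that under this definition the error rate can exceed 1 when insertions are frequent, so a lower value is better and it is not a percentage.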
Abstract
In this technical report, we present a set of methods for task 4 of the Detection and Classification of Acoustic Scenes and Events 2017 (DCASE2017) challenge. This task evaluates systems for the large-scale detection of sound events using weakly labeled training data. The data are YouTube video excerpts focusing on transportation and warnings because of their industry applications. There are two subtasks: audio tagging and sound event detection from weakly labeled data. A convolutional neural network (CNN) and a gated recurrent unit (GRU) based recurrent neural network (RNN) are adopted as our basic framework. We propose a learnable gating activation function for selecting informative local features. An attention-based scheme is used for localizing specific events in a weakly supervised mode. A new batch-level balancing strategy is also proposed to tackle the data imbalance problem. Fusion…
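To make the idea of attention-based localization under weak labels more concrete, here is a minimal sketch for a single event class. It assumes the attention weights come from a softmax over time and that the clip-level prediction is the attention-weighted sum of frame-level probabilities; this is one common formulation and may differ from the authors' exact scheme. The inputs `frame_probs` and `attention_scores` are hypothetical network outputs.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(frame_probs, attention_scores):
    """Pool frame-level probabilities into one clip-level probability.

    Only the clip-level output needs a label during training, yet the
    attention weights indicate which frames drove the decision, which is
    what allows localization without frame-level annotation.
    """
    weights = softmax(attention_scores)
    clip_prob = sum(w * p for w, p in zip(weights, frame_probs))
    return clip_prob, weights


# Hypothetical outputs for a three-frame clip.
clip_prob, weights = attention_pool([0.1, 0.9, 0.2], [0.0, 4.0, 0.0])
```

In this toy example the second frame receives most of the attention, so the clip-level probability is pulled toward that frame's value and the weights point to where the event is likely to occur.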
Taxonomy
Topics: Computational Physics and Python Applications · Time Series Analysis and Forecasting · Anomaly Detection Techniques and Applications
