A Joint Framework for Audio Tagging and Weakly Supervised Acoustic Event Detection Using DenseNet with Global Average Pooling
Chieh-Chi Kao, Bowen Shi, Ming Sun, Chao Wang

TL;DR
This paper introduces a DenseNet-based neural network with global average pooling for audio tagging and weakly supervised acoustic event detection, achieving state-of-the-art results without recurrent layers.
Contribution
The novel framework uses DenseNet with GAP for direct event localization, outperforming previous attention-based models in weakly supervised AED.
Findings
Outperforms the state of the art in the DCASE 2017 audio tagging task by 5.3% in F1 score on the dev set.
Achieves an 8.1% improvement in event-based F1 on the DCASE 2018 AED task.
Data augmentation and tri-training further improve performance.
Abstract
This paper proposes a network architecture mainly designed for audio tagging, which can also be used for weakly supervised acoustic event detection (AED). The proposed network consists of a modified DenseNet as the feature extractor and a global average pooling (GAP) layer to predict frame-level labels at inference time. This architecture is inspired by the work of Zhou et al., a well-known framework that uses GAP to localize visual objects given image-level labels. While most previous work on weakly supervised AED used recurrent layers with attention-based mechanisms to localize acoustic events, the proposed network directly localizes events using the feature map extracted by DenseNet, without any recurrent layers. In the audio tagging task of DCASE 2017, our method significantly outperforms the state-of-the-art method in F1 score by 5.3% on the dev set, and 6.0% on the…
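The localization idea in the abstract can be illustrated with a minimal sketch. It follows the general class-activation-map recipe of Zhou et al. transferred to a time axis and is not the authors' implementation. It assumes the feature extractor has already produced a feature map of shape [channels][frames] (frequency already pooled away) and that a single linear layer follows the GAP. The names `clip_logits`, `frame_logits`, and `active_frames` are hypothetical, and the toy numbers are made up for illustration.

```python
from typing import List

Matrix = List[List[float]]


def clip_logits(features: Matrix, weights: Matrix, bias: List[float]) -> List[float]:
    """Clip-level (audio tagging) logits: GAP over time, then a linear layer.

    features: [channels][frames], assumed output of the feature extractor.
    weights:  [classes][channels], assumed weights of the layer after GAP.
    """
    pooled = [sum(channel) / len(channel) for channel in features]
    return [
        sum(w * p for w, p in zip(class_weights, pooled)) + b
        for class_weights, b in zip(weights, bias)
    ]


def frame_logits(features: Matrix, weights: Matrix, bias: List[float]) -> Matrix:
    """Frame-level logits: apply the same linear layer to every frame, skipping GAP.

    Because the layer is linear, averaging these logits over time gives back
    exactly the clip-level logits, so weak (clip-level) training still shapes them.
    """
    n_frames = len(features[0])
    return [
        [
            sum(w * channel[t] for w, channel in zip(class_weights, features)) + b
            for t in range(n_frames)
        ]
        for class_weights, b in zip(weights, bias)
    ]


def active_frames(row: List[float], threshold: float) -> List[int]:
    """Indices of frames whose logit exceeds a (hypothetical) decision threshold."""
    return [t for t, value in enumerate(row) if value > threshold]


# Toy example: 2 channels, 4 frames, 2 classes (illustrative values only).
features = [[0.0, 0.0, 2.0, 2.0],
            [1.0, 1.0, 0.0, 0.0]]
weights = [[1.0, 0.0],
           [0.0, 1.0]]
bias = [0.0, 0.0]

tags = clip_logits(features, weights, bias)      # [1.0, 0.5]
per_frame = frame_logits(features, weights, bias)
events = active_frames(per_frame[0], 1.0)        # [2, 3]
```

In this toy case class 0 is tagged at the clip level and localized to the last two frames, with no recurrent or attention layer involved. A real system would additionally need a nonlinearity on the logits, a tuned threshold, and post-processing of the frame decisions, none of which are specified in the text above.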
Taxonomy
Topics: Music and Audio Processing · Speech and Audio Processing · Video Analysis and Summarization
