Deep Convolutional Neural Network with Mixup for Environmental Sound Classification

Zhichao Zhang, Shugong Xu, Shan Cao, and Shunqing Zhang

arXiv:1808.08405 · cs.SD · August 28, 2018 · 5 cites

TL;DR

This paper introduces a novel deep convolutional neural network with mixup data augmentation for environmental sound classification, achieving state-of-the-art results on UrbanSound8K and competitive performance on other datasets.

Contribution

The paper proposes a new CNN architecture combined with mixup augmentation specifically tailored for environmental sound classification tasks.

Findings

01. Achieved 83.7% accuracy on UrbanSound8K

02. Demonstrated the effectiveness of mixup in improving classification performance

03. Provided competitive results on ESC-50 and ESC-10 datasets

Abstract

Environmental sound classification (ESC) is an important and challenging problem. In contrast to speech, sound events have a noise-like nature and may be produced by a wide variety of sources. In this paper, we propose a novel deep convolutional neural network for ESC tasks. Our network architecture uses stacked convolutional and pooling layers to extract high-level feature representations from spectrogram-like features. Furthermore, we apply mixup to ESC tasks and explore its impact on classification performance and feature distribution. Experiments were conducted on the UrbanSound8K, ESC-50 and ESC-10 datasets. The results show that our ESC system achieves state-of-the-art performance (83.7%) on UrbanSound8K and competitive performance on ESC-50 and ESC-10.
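
For readers unfamiliar with mixup, the following is a minimal sketch of the general technique, not the authors' implementation. It blends each example in a batch with a randomly chosen partner, using a weight drawn from a Beta(alpha, alpha) distribution, and blends the one-hot labels with the same weight. The function name `mixup_batch`, the value `alpha=0.2`, and the feature shape in the usage lines are assumptions chosen for illustration; the abstract does not state the settings used in the paper.

```python
import numpy as np


def mixup_batch(features, labels_onehot, alpha=0.2, rng=None):
    """Mix each example with a random partner from the same batch.

    features:      array of shape (batch, ...), e.g. spectrogram-like inputs
    labels_onehot: array of shape (batch, num_classes)
    alpha:         Beta distribution parameter (illustrative value, an assumption)
    """
    rng = np.random.default_rng() if rng is None else rng
    # One mixing weight per batch, drawn from Beta(alpha, alpha), so 0 <= lam <= 1.
    lam = rng.beta(alpha, alpha)
    # Random pairing: example i is mixed with example perm[i].
    perm = rng.permutation(features.shape[0])
    mixed_x = lam * features + (1.0 - lam) * features[perm]
    mixed_y = lam * labels_onehot + (1.0 - lam) * labels_onehot[perm]
    return mixed_x, mixed_y, lam


# Usage with made-up data: 4 examples of a hypothetical 60x41 feature map,
# 10 classes (UrbanSound8K and ESC-10 both have 10 classes).
rng = np.random.default_rng(0)
x = rng.random((4, 60, 41))
y = np.eye(10)[[0, 3, 3, 7]]
mixed_x, mixed_y, lam = mixup_batch(x, y, alpha=0.2, rng=rng)
```

Because the labels are mixed with the same weight as the inputs, each mixed label row still sums to 1 and can be used directly as a soft target for a cross-entropy loss.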

Taxonomy

Topics: Music and Audio Processing · Speech and Audio Processing · Animal Vocal Communication and Behavior