Integrating the Data Augmentation Scheme with Various Classifiers for Acoustic Scene Modeling

Hangting Chen; Zuozhen Liu; Zongming Liu; Pengyuan Zhang; Yonghong Yan

arXiv:1907.06639·eess.AS·July 17, 2019·67 cites

TL;DR

This technical report presents an acoustic scene classification system that combines data augmentation based on generative adversarial networks (GANs) with multiple classifiers. The fused systems achieve over 85% accuracy on the officially provided fold 1 evaluation dataset of DCASE2019 Task 1A.

Contribution

It combines GAN-based data augmentation with several classifiers, chiefly a 1D deep CNN on scalogram features and a 2D fully convolutional network on Mel filter bank features, and fuses them for acoustic scene modeling.

Findings

01

Achieved over 85% accuracy on the officially provided fold 1 evaluation dataset.

02

Demonstrated effectiveness of GAN-based data augmentation.

03

Enhanced classification performance through classifier fusion.
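
The report does not state here how the classifiers are fused, so the following is only a minimal sketch of one common option: score-level (late) fusion by a weighted average of each classifier's class posteriors. The function names `softmax`, `fuse_posteriors`, and `predict`, and the weighting scheme, are assumptions for illustration and are not taken from the paper.

```python
import math


def softmax(logits):
    """Convert raw classifier scores into a probability distribution."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_posteriors(posteriors, weights=None):
    """Weighted average of per-classifier posteriors for a single clip.

    posteriors: list of probability vectors, one per classifier.
    weights: optional per-classifier weights (hypothetical; uniform if None).
    """
    n = len(posteriors)
    if weights is None:
        weights = [1.0 / n] * n
    if len(weights) != n:
        raise ValueError("need exactly one weight per classifier")
    num_classes = len(posteriors[0])
    fused = [0.0] * num_classes
    for w, p in zip(weights, posteriors):
        if len(p) != num_classes:
            raise ValueError("all classifiers must score the same classes")
        for k, v in enumerate(p):
            fused[k] += w * v
    total = sum(fused)
    return [v / total for v in fused]  # renormalize to sum to 1


def predict(posteriors, weights=None):
    """Return the index of the scene class with the highest fused score."""
    fused = fuse_posteriors(posteriors, weights)
    return max(range(len(fused)), key=fused.__getitem__)
```

For example, `predict([softmax([2.0, 1.0, 0.0]), softmax([0.0, 3.0, 0.0])])` averages two hypothetical classifiers over three classes. In practice the weights would be tuned on held-out data.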

Abstract

This technical report describes the IOA team's submission for Task 1A of the DCASE2019 challenge. Our acoustic scene classification (ASC) system adopts a data augmentation scheme employing generative adversarial networks. Two major classifiers are deployed in the scheme: a 1D deep convolutional neural network integrated with scalogram features, and a 2D fully convolutional neural network integrated with Mel filter bank features. Other approaches, such as adversarial city adaptation, a temporal module based on the discrete cosine transform, and hybrid architectures, have been developed for further fusion. The results of our experiments indicate that the final fusion systems A-D achieve an accuracy higher than 85% on the officially provided fold 1 evaluation dataset.
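
The abstract does not describe the internals of the temporal module, so the following is only a sketch of the general idea of applying a discrete cosine transform along the time axis of a feature matrix: each frequency band's trajectory over frames is transformed with a DCT-II and only the low-order coefficients are kept as a compact temporal summary. The functions `dct2` and `temporal_dct`, and the truncation parameter `num_coeffs`, are hypothetical names chosen for illustration, not the authors' implementation.

```python
import math


def dct2(x):
    """Unnormalized DCT-II of a 1D sequence x."""
    n = len(x)
    return [
        sum(x[t] * math.cos(math.pi * (t + 0.5) * k / n) for t in range(n))
        for k in range(n)
    ]


def temporal_dct(features, num_coeffs):
    """Apply the DCT along time for every band of a feature matrix.

    features: list of frames, each a list of per-band values
              (for example log Mel filter bank energies).
    num_coeffs: number of low-order DCT coefficients kept per band
                (an assumed hyperparameter).
    Returns one list of num_coeffs coefficients per band.
    """
    num_frames = len(features)
    num_bands = len(features[0])
    out = []
    for b in range(num_bands):
        # Trajectory of band b across all frames.
        trajectory = [features[t][b] for t in range(num_frames)]
        out.append(dct2(trajectory)[:num_coeffs])
    return out
```

With this unnormalized form, the zeroth coefficient of a band is the sum of its values over time, and higher coefficients capture progressively faster temporal variation.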

Taxonomy

Topics: Music and Audio Processing · Speech and Audio Processing · Speech Recognition and Synthesis

Methods: Discrete Cosine Transform