DCASE 2018 Challenge: Solution for Task 5

Jeremy Chew; Yingxiang Sun; Lahiru Jayasinghe; Chau Yuen

arXiv:1812.04618 · eess.AS · December 13, 2018

Open Access

TL;DR

This paper presents an ensemble learning system that combines CNN and LSTM models to classify domestic activities. It reaches an F1-score of 92.19% on the development dataset of DCASE 2018 Task 5, outperforming the baseline system.

Contribution

The paper introduces an ensemble of three CNN- and LSTM-based models that use spectrogram and MFCC features extracted from different channels to improve activity classification.

Findings

01. F1-score of 92.19% on the development dataset

02. F1-score 7.69 percentage points above the baseline system

03. Effective classification of domestic activities from multi-channel audio features

Abstract

We propose an ensemble learning system to address Task 5 of the Detection and Classification of Acoustic Scenes and Events (DCASE) 2018 challenge. The system consists of three different models based on convolutional neural networks (CNNs) and long short-term memory (LSTM) recurrent neural networks. Using features such as spectrograms and mel-frequency cepstral coefficients (MFCCs) extracted from different channels, the system classifies different domestic activities effectively. Experimental results on the provided development dataset show an F1-score of 92.19%. Compared with the baseline system, the proposed system improves the F1-score by 7.69 percentage points.
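
The abstract does not say how the three models' outputs are combined or how the F1-score is averaged across classes. The sketch below is therefore only an illustration under two assumptions: the ensemble averages the class probabilities of its members (soft voting), and the F1-score is macro-averaged over classes. The function names `ensemble_average` and `macro_f1`, the three toy probability arrays, and the labels are all hypothetical and are not taken from the paper.

```python
import numpy as np


def ensemble_average(prob_list):
    """Average class-probability arrays, each of shape (n_clips, n_classes).

    Assumption: soft voting; the paper's actual combination rule is not
    given in the abstract.
    """
    return np.mean(np.stack(prob_list, axis=0), axis=0)


def macro_f1(y_true, y_pred, n_classes):
    """Unweighted mean of the per-class F1 scores."""
    scores = []
    for c in range(n_classes):
        tp = np.sum((y_pred == c) & (y_true == c))
        fp = np.sum((y_pred == c) & (y_true != c))
        fn = np.sum((y_pred != c) & (y_true == c))
        denom = 2 * tp + fp + fn
        # A class with no true or predicted samples contributes 0.
        scores.append(2 * tp / denom if denom > 0 else 0.0)
    return float(np.mean(scores))


# Toy data: 4 clips, 3 classes, 3 hypothetical models.
y_true = np.array([0, 1, 2, 1])
model_a = np.array([[0.7, 0.2, 0.1],
                    [0.1, 0.8, 0.1],
                    [0.2, 0.2, 0.6],
                    [0.5, 0.4, 0.1]])  # misclassifies the last clip
model_b = np.array([[0.6, 0.3, 0.1],
                    [0.2, 0.6, 0.2],
                    [0.1, 0.3, 0.6],
                    [0.3, 0.6, 0.1]])
model_c = np.array([[0.8, 0.1, 0.1],
                    [0.3, 0.5, 0.2],
                    [0.3, 0.3, 0.4],
                    [0.2, 0.7, 0.1]])

avg = ensemble_average([model_a, model_b, model_c])
y_pred = avg.argmax(axis=1)

score_ensemble = macro_f1(y_true, y_pred, n_classes=3)
score_model_a = macro_f1(y_true, model_a.argmax(axis=1), n_classes=3)
print(score_model_a, score_ensemble)
```

In this toy case the averaged probabilities correct the single error made by the first model, which is the general motivation for combining models that make different mistakes.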

Taxonomy

Topics: Music and Audio Processing · Speech and Audio Processing · Speech Recognition and Synthesis