Environmental Sound Recognition using Masked Conditional Neural Networks

Fady Medhat; David Chesmore; John Robinson

arXiv:1804.02665·cs.LG·April 12, 2019

TL;DR

This paper introduces the Masked Conditional Neural Network (MCLNN), a neural architecture for environmental sound recognition that learns over frequency bands rather than individual bins and automates the exploration of feature combinations.

Contribution

The paper presents MCLNN, which extends the ConditionaL Neural Network (CLNN) by embedding filterbank-like behavior in the network while still modeling the relations across frames of a temporal signal.

Findings

01

MCLNN achieved competitive accuracy on the ESC-10 dataset.

02

MCLNN outperformed traditional neural networks in environmental sound classification.

03

The approach automates the exploration of feature combinations, reducing the manual effort of handcrafting them.

Abstract

Neural network based architectures used for sound recognition are usually adapted from other application domains, which may not harness sound related properties. The ConditionaL Neural Network (CLNN) is designed to consider the relational properties across frames in a temporal signal, and its extension the Masked ConditionaL Neural Network (MCLNN) embeds a filterbank behavior within the network, which enforces the network to learn in frequency bands rather than bins. Additionally, it automates the exploration of different feature combinations analogous to handcrafting the optimum combination of features for a recognition task. We applied the MCLNN to the environmental sounds of the ESC-10 dataset. The MCLNN achieved competitive accuracies compared to state-of-the-art convolutional neural networks and hand-crafted attempts.
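The two ideas in the abstract, conditioning each output on neighboring frames and masking the weights so that units learn over frequency bands, can be illustrated with a minimal NumPy sketch. This is an illustration only, not the authors' implementation: the function names `band_mask` and `masked_conditional_layer`, the evenly spaced bands, the `bandwidth` parameter, and the ReLU activation are all assumptions made for this example, and the paper's actual mask pattern and layer definition may differ.

```python
import numpy as np

def band_mask(n_in, n_out, bandwidth):
    """Hypothetical binary mask: each output unit sees one contiguous band of input bins."""
    mask = np.zeros((n_in, n_out))
    # Assumption: band start positions are spread evenly over the frequency axis.
    starts = np.linspace(0, n_in - bandwidth, n_out).round().astype(int)
    for unit, start in enumerate(starts):
        mask[start:start + bandwidth, unit] = 1.0
    return mask

def masked_conditional_layer(frames, weights, bias, mask):
    """Predict each central frame's activations from itself and its neighbors.

    frames:  (T, n_in) spectrogram frames
    weights: (2 * order + 1, n_in, n_out), one weight matrix per frame in the window
    bias:    (n_out,)
    mask:    (n_in, n_out), applied element-wise to every weight matrix
    Returns an array of shape (T - 2 * order, n_out).
    """
    order = (weights.shape[0] - 1) // 2
    masked = weights * mask  # zero out connections that fall outside each band
    outputs = []
    for t in range(order, frames.shape[0] - order):
        window = frames[t - order:t + order + 1]  # (2 * order + 1, n_in)
        pre_activation = bias + np.einsum("ui,uio->o", window, masked)
        outputs.append(np.maximum(pre_activation, 0.0))  # ReLU is an assumption
    return np.array(outputs)

rng = np.random.default_rng(0)
frames = rng.standard_normal((10, 8))    # 10 frames, 8 frequency bins
mask = band_mask(n_in=8, n_out=4, bandwidth=3)
weights = rng.standard_normal((3, 8, 4))  # window of 3 frames (order = 1)
out = masked_conditional_layer(frames, weights, np.zeros(4), mask)
print(out.shape)  # (8, 4)
```

Because the mask zeroes every weight outside a unit's band, that unit cannot respond to bins outside the band, which is the filterbank-like constraint the abstract describes. The window over neighboring frames is what makes the layer conditional on temporal context.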

Code & Models

Repositories

fadymedhat/MCLNN (tf, Official)
