Masked Conditional Neural Networks for Environmental Sound Classification

Fady Medhat; David Chesmore; John Robinson

arXiv:1805.10004·cs.LG·April 30, 2019

TL;DR

This paper introduces the Masked Conditional Neural Network (MCLNN), a model that learns frequency band features for environmental sound classification, achieving competitive results with fewer parameters and exploring the impact of tonal properties on classification.

Contribution

The paper proposes MCLNN, a novel neural network architecture that incorporates frequency band learning via masking, and evaluates its effectiveness on environmental sound datasets including Urbansound8k and YorNoise.

Findings

01

MCLNN achieves competitive accuracy with 12% of the parameters of state-of-the-art CNNs.

02

Masking automates feature combination exploration, reducing manual feature engineering.

03

Tonal properties influence classification performance, as shown on the YorNoise dataset.

Abstract

The ConditionaL Neural Network (CLNN) exploits the nature of the temporal sequencing of the sound signal represented in a spectrogram, and its variant, the Masked ConditionaL Neural Network (MCLNN), induces the network to learn in frequency bands by embedding a filterbank-like sparseness over the network's links using a binary mask. Additionally, the masking automates the exploration of different feature combinations concurrently, analogous to handcrafting the optimum combination of features for a recognition task. We have evaluated the MCLNN performance using the Urbansound8k dataset of environmental sounds. Additionally, we present a collection of manually recorded sounds for rail and road traffic, YorNoise, to investigate the confusion rates among machine-generated sounds possessing low-frequency components. MCLNN has achieved competitive results without augmentation and using 12% of…
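
The idea of a binary mask that imposes filterbank-like sparseness on a layer's links can be illustrated with a small sketch. This is a simplified illustration, not the paper's exact mask construction: the function names `band_mask` and `apply_mask`, the `bandwidth` and `overlap` parameters, and the wrap-around rule for the band's starting position are all assumptions made for this example. Each hidden unit is allowed to see only a contiguous band of input frequency bins, and the mask is applied by element-wise multiplication with the weight matrix.

```python
def band_mask(n_in, n_hidden, bandwidth, overlap):
    """Build an n_in x n_hidden binary mask (list of rows).

    Assumed scheme (illustrative only): hidden unit j is connected to
    `bandwidth` consecutive input bins, and successive units are shifted
    by (bandwidth - overlap) bins, wrapping back to bin 0 at the end.
    """
    if not 0 <= overlap < bandwidth:
        raise ValueError("overlap must satisfy 0 <= overlap < bandwidth")
    stride = bandwidth - overlap
    mask = [[0] * n_hidden for _ in range(n_in)]
    for j in range(n_hidden):
        start = (j * stride) % n_in
        # Bands that run past the last input bin are clipped, not wrapped.
        for i in range(start, min(start + bandwidth, n_in)):
            mask[i][j] = 1
    return mask


def apply_mask(weights, mask):
    """Zero out masked links by element-wise multiplication."""
    return [[w * m for w, m in zip(w_row, m_row)]
            for w_row, m_row in zip(weights, mask)]


# Six input frequency bins, four hidden units, bands of width 3 overlapping by 1.
mask = band_mask(n_in=6, n_hidden=4, bandwidth=3, overlap=1)
weights = [[1.0] * 4 for _ in range(6)]
masked_weights = apply_mask(weights, mask)
for row in mask:
    print(row)
```

In this toy setting, each column of the mask is one hidden unit's band, so different units attend to different, partially overlapping frequency regions. In a trained network the mask would stay fixed while the surviving weights are learned.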
