Comparing Time and Frequency Domain for Audio Event Recognition Using Deep Learning

Lars Hertel; Huy Phan; Alfred Mertins

arXiv:1603.05824 · cs.NE · March 21, 2016

TL;DR

This study compares deep networks for audio event recognition trained on time-domain and frequency-domain representations of the signal. It finds that features learned from the frequency domain are more effective and that adding convolution and pooling layers improves performance further, achieving state-of-the-art results.

Contribution

The paper demonstrates that features learned from the frequency domain outperform those learned from the time domain in deep-learning-based audio event recognition, and it highlights the benefit of convolution and pooling layers.

Findings

01. Features learned from the frequency domain lead to better recognition accuracy than features learned from the time domain.

02. Convolution and pooling layers significantly improve recognition performance.

03. The resulting networks achieve state-of-the-art results on benchmark datasets.
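
To make the second finding concrete, here is a minimal NumPy sketch of what a convolution layer followed by a pooling layer computes on a one-dimensional input. It is not the authors' architecture: the input values, the kernel, and the pooling size are hypothetical and chosen only for illustration.

```python
import numpy as np

def conv1d_valid(x, kernel):
    """Slide a kernel over x without padding (cross-correlation, as in CNNs)."""
    k = len(kernel)
    return np.array([np.dot(x[i:i + k], kernel) for i in range(len(x) - k + 1)])

def max_pool1d(x, size):
    """Non-overlapping max pooling; a trailing remainder is dropped."""
    n = len(x) // size
    return x[:n * size].reshape(n, size).max(axis=1)

# Hypothetical input and filter, not taken from the paper.
x = np.array([0.0, 1.0, 3.0, 1.0, 0.0, 0.0, 2.0, 0.0])
kernel = np.array([1.0, 0.0, -1.0])

# Convolution responds to local structure; ReLU keeps positive responses.
feature_map = np.maximum(conv1d_valid(x, kernel), 0.0)

# Pooling keeps the strongest response in each local neighbourhood.
pooled = max_pool1d(feature_map, size=2)
```

The kernel only sees three neighbouring samples at a time, which is the sense in which such layers explore local structure, and pooling makes the response less sensitive to the exact position of that structure.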

Abstract

Recognizing acoustic events is an intricate problem for a machine and an emerging field of research. Deep neural networks achieve convincing results and are currently the state-of-the-art approach for many tasks. One advantage is their implicit feature learning, as opposed to explicit feature extraction from the input signal. In this work, we analyzed whether more discriminative features can be learned from the time-domain or from the frequency-domain representation of the audio signal. For this purpose, we trained multiple deep networks with different architectures on the Freiburg-106 and ESC-10 datasets. Our results show that feature learning from the frequency domain is superior to feature learning from the time domain. Moreover, using convolution and pooling layers to explore local structures of the audio signal significantly improves the recognition performance and achieves…
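
As a rough illustration of the two input representations the abstract compares, the sketch below builds time-domain frames and their frequency-domain magnitude spectra with NumPy. The sampling rate, frame length, hop size, Hann window, and test tone are assumptions made for this example; the paper's actual preprocessing is not described on this page.

```python
import numpy as np

def frame_signal(signal, frame_len, hop):
    """Split a 1-D signal into overlapping frames (time-domain input)."""
    n_frames = 1 + (len(signal) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return signal[idx]

def magnitude_spectrum(frames):
    """Hann-windowed magnitude spectrum per frame (frequency-domain input)."""
    window = np.hanning(frames.shape[1])
    return np.abs(np.fft.rfft(frames * window, axis=1))

# Hypothetical signal: one second of a 440 Hz tone sampled at 16 kHz.
sr = 16000
frame_len = 512
t = np.arange(sr) / sr
signal = np.sin(2 * np.pi * 440.0 * t)

time_features = frame_signal(signal, frame_len=frame_len, hop=256)
freq_features = magnitude_spectrum(time_features)

# Frequency of the strongest bin in the first frame, in Hz.
peak_hz = np.argmax(freq_features[0]) * sr / frame_len
```

Either `time_features` or `freq_features` could then be fed to a network as one input vector per frame. In the frequency-domain version the tone shows up as a single dominant bin, whereas in the time-domain version the same information is spread across all samples of the frame.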

Taxonomy

Methods: Convolution