Learning Filter Banks Using Deep Learning For Acoustic Signals

Shuhui Qu; Juncheng Li; Wei Dai; Samarjit Das

arXiv:1611.09526 · cs.SD · November 30, 2016 · 2 cites

Open Access

TL;DR

This paper introduces a hybrid deep learning approach that combines domain knowledge and data-driven methods to design interpretable acoustic filter banks, improving environmental sound recognition accuracy.

Contribution

The work presents a novel filter bank learning layer integrated with CNNs, enabling automatic, interpretable acoustic feature design guided by experience, and demonstrates its effectiveness on a real dataset.

Findings

01  Achieved a 2% accuracy improvement over fixed log Mel-filter banks.

02  Visualized and validated the shape of learned filter banks.

03  Demonstrated the interpretability and effectiveness of the hybrid feature design.

Abstract

Designing appropriate features for acoustic event recognition tasks is an active field of research. Expressive features should both improve task performance and be interpretable. Currently, heuristically designed features based on domain knowledge require tremendous hand-crafting effort, while features extracted through deep networks are difficult for humans to interpret. In this work, we explore an experience-guided learning method for designing acoustic features. This is a novel hybrid approach combining domain knowledge with purely data-driven feature design. Based on the procedure of log Mel-filter banks, we design a filter bank learning layer. We concatenate this layer with a convolutional neural network (CNN) model. After training the network, the weights of the filter bank learning layer are extracted to facilitate the design of acoustic features. We…
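The abstract describes a filter bank learning layer modeled on the log Mel-filter bank procedure. The following is a minimal sketch of that idea, not the authors' implementation: it builds triangular Mel filters as an initial weight matrix and applies them to one power-spectrum frame, followed by a log. The function names, the HTK-style Mel formula, and the sizes (40 filters, 512-point FFT, 16 kHz) are all assumptions chosen for illustration; the page does not state the paper's actual settings. In a trainable version, `weights` would be a parameter updated by backpropagation together with the CNN, and a real implementation would likely use an array library rather than plain lists.

```python
import math

def hz_to_mel(f):
    # HTK-style Mel scale (an assumed convention, not taken from the paper)
    return 2595.0 * math.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filter_bank(n_filters, n_fft, sample_rate):
    """Triangular Mel filters as an n_filters x (n_fft // 2 + 1) list of lists."""
    n_bins = n_fft // 2 + 1
    mel_max = hz_to_mel(sample_rate / 2.0)
    # n_filters + 2 edge frequencies, equally spaced on the Mel scale
    edges_hz = [mel_to_hz(mel_max * i / (n_filters + 1))
                for i in range(n_filters + 2)]
    bin_hz = [k * sample_rate / n_fft for k in range(n_bins)]
    bank = []
    for m in range(1, n_filters + 1):
        lo, mid, hi = edges_hz[m - 1], edges_hz[m], edges_hz[m + 1]
        row = []
        for f in bin_hz:
            rising = (f - lo) / (mid - lo)
            falling = (hi - f) / (hi - mid)
            # Triangle: zero outside [lo, hi], peak of 1 at mid
            row.append(max(0.0, min(rising, falling)))
        bank.append(row)
    return bank

def filter_bank_layer(weights, power_spectrum, eps=1e-6):
    """Forward pass: log of each filter's weighted sum over spectrum bins."""
    return [math.log(sum(w * p for w, p in zip(row, power_spectrum)) + eps)
            for row in weights]

# Hypothetical sizes; the weights start as fixed Mel filters and would be
# the learnable parameters of the layer during training.
weights = mel_filter_bank(n_filters=40, n_fft=512, sample_rate=16000)
frame = [1.0] * 257  # a flat power spectrum, used only as dummy input
features = filter_bank_layer(weights, frame)
```

Starting from Mel-shaped weights is one way to encode the domain knowledge the abstract refers to; after training, each row of `weights` can be plotted against frequency to inspect how the learned filter differs from its triangular starting point.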

Taxonomy

Topics: Music and Audio Processing · Speech and Audio Processing · Speech Recognition and Synthesis