DCTNet and PCANet for acoustic signal feature extraction

Yin Xian, Andrew Thompson, Xiaobai Sun, Douglas Nowacek, and Loren Nolte

arXiv:1605.01755·cs.SD·May 9, 2016·2 cites

TL;DR

This paper presents DCTNet, an efficient alternative to PCANet for acoustic signal classification, leveraging DCT functions for improved time-frequency feature extraction, with demonstrated success on whale vocalization data.

Contribution

The paper introduces DCTNet as a computationally efficient approximation to PCANet, enhancing acoustic signal classification by utilizing DCT-based filterbanks for feature extraction.

Findings

01 DCTNet improves classification accuracy on whale vocalization data.

02 DCTNet effectively approximates PCANet using DCT functions.

03 The method relates to spectral feature representations like STFT and spectrogram.

Abstract

We introduce the use of DCTNet, an efficient approximation and alternative to PCANet, for acoustic signal classification. In PCANet, the eigenfunctions of the local sample covariance matrix (PCA) are used as filterbanks for convolution and feature extraction. When the eigenfunctions are well approximated by the Discrete Cosine Transform (DCT) functions, each layer of PCANet and DCTNet is essentially a time-frequency representation. We relate DCTNet to spectral feature representation methods, such as the short-time Fourier transform (STFT), spectrogram, and linear frequency spectral coefficients (LFSC). Experimental results on whale vocalization data show that DCTNet improves classification rate, demonstrating DCTNet's applicability to signal processing problems such as underwater acoustics.
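The filterbank idea in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names, the stride-1 windowing, the per-patch mean removal, and the patch length of 16 are all assumptions chosen for illustration. The sketch builds a PCA filterbank from the local sample covariance of signal patches, builds an orthonormal DCT-II filterbank of the same size, and applies either one as a single layer, giving a frequency-by-time array comparable in spirit to a short-time transform.

```python
import numpy as np


def dct_filterbank(patch_len):
    """Return a (patch_len, patch_len) array whose rows are orthonormal DCT-II basis vectors."""
    n = np.arange(patch_len)
    k = n[:, None]
    basis = np.cos(np.pi * (n[None, :] + 0.5) * k / patch_len)
    basis[0] *= np.sqrt(1.0 / patch_len)   # DC row
    basis[1:] *= np.sqrt(2.0 / patch_len)  # remaining rows
    return basis


def local_patches(signal, patch_len):
    """All overlapping windows of length patch_len (stride 1), one window per row."""
    return np.lib.stride_tricks.sliding_window_view(signal, patch_len)


def pca_filterbank(signal, patch_len):
    """Eigenvectors of the local sample covariance matrix, sorted by decreasing eigenvalue.

    Assumption: each patch has its own mean removed before the covariance is formed.
    """
    patches = local_patches(signal, patch_len)
    patches = patches - patches.mean(axis=1, keepdims=True)
    cov = patches.T @ patches / patches.shape[0]
    eigvals, eigvecs = np.linalg.eigh(cov)
    order = np.argsort(eigvals)[::-1]
    return eigvecs[:, order].T  # one filter per row


def filterbank_layer(signal, filters):
    """Correlate the signal with every filter (filters are not flipped).

    Returns an array of shape (num_filters, num_positions): rows index the
    filter (frequency-like axis), columns index time.
    """
    patches = local_patches(signal, filters.shape[1])
    return filters @ patches.T


# Hypothetical usage on a synthetic chirp (not the whale data from the paper).
t = np.linspace(0.0, 1.0, 2048)
chirp = np.sin(2.0 * np.pi * (50.0 * t + 100.0 * t ** 2))
dct_out = filterbank_layer(chirp, dct_filterbank(16))
pca_out = filterbank_layer(chirp, pca_filterbank(chirp, 16))
```

Because the DCT filters are fixed, this layer needs no covariance estimate or eigendecomposition, which is the efficiency argument the abstract makes. Squaring the magnitudes of dct_out gives a spectrogram-like picture of the signal.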

Code & Models

Repositories

poline3939/DCTNet

Taxonomy

Topics: Music and Audio Processing · Speech Recognition and Synthesis · Speech and Audio Processing

Methods: Discrete Cosine Transform