DCTNet and PCANet for acoustic signal feature extraction
Yin Xian, Andrew Thompson, Xiaobai Sun, Douglas Nowacek, and Loren Nolte

TL;DR
This paper presents DCTNet, an efficient alternative to PCANet for acoustic signal classification, leveraging DCT functions for improved time-frequency feature extraction, with demonstrated success on whale vocalization data.
Contribution
The paper introduces DCTNet as a computationally efficient approximation to PCANet, enhancing acoustic signal classification by utilizing DCT-based filterbanks for feature extraction.
Findings
DCTNet improves classification accuracy on whale vocalization data.
DCTNet effectively approximates PCANet using DCT functions.
DCTNet is related to spectral feature representations such as the short time Fourier transform (STFT), the spectrogram, and linear frequency spectral coefficients (LFSC).
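The link to spectrogram-style features can be illustrated with a small sketch. It is not the authors' implementation: the function names, the window length of 64, the hop of 3, and the test tone are all assumptions chosen for illustration. Each frame of the signal is projected onto DCT-II basis functions and the squared coefficients are kept, which gives a frames-by-channels energy map analogous to a spectrogram.

```python
import numpy as np

def dct_filters(window):
    """Rows are orthonormal DCT-II basis vectors of length `window`."""
    n = np.arange(window)
    k = np.arange(window)[:, None]
    filters = np.sqrt(2.0 / window) * np.cos(np.pi * (n + 0.5) * k / window)
    filters[0] /= np.sqrt(2.0)  # rescale the constant (DC) row to unit norm
    return filters

def dct_time_frequency(signal, window, hop):
    """Short-time DCT energy map with shape (num_frames, window)."""
    starts = np.arange(0, len(signal) - window + 1, hop)
    frames = signal[starts[:, None] + np.arange(window)[None, :]]
    coeffs = frames @ dct_filters(window).T  # one DCT coefficient per channel
    return coeffs ** 2

# Hypothetical test tone at 0.125 cycles/sample. DCT channel k of a
# length-N window is centred at k / (2N) cycles/sample.
window, hop = 64, 3
tone = np.cos(2 * np.pi * 0.125 * np.arange(4096))
energy = dct_time_frequency(tone, window, hop)
peak_channel = int(energy.mean(axis=0).argmax())
```

Under these assumptions the time-averaged energy concentrates in the channel whose centre frequency matches the tone, which is the same behaviour one expects from a spectrogram bin.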
Abstract
We introduce the use of DCTNet, an efficient approximation and alternative to PCANet, for acoustic signal classification. In PCANet, the eigenfunctions of the local sample covariance matrix (PCA) are used as filterbanks for convolution and feature extraction. When the eigenfunctions are well approximated by the Discrete Cosine Transform (DCT) functions, each layer of PCANet and DCTNet is essentially a time-frequency representation. We relate DCTNet to spectral feature representation methods, such as the short time Fourier transform (STFT), spectrogram and linear frequency spectral coefficients (LFSC). Experimental results on whale vocalization data show that DCTNet improves classification rate, demonstrating DCTNet's applicability to signal processing problems such as underwater acoustics.
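The approximation described in the abstract can be explored with a minimal sketch. It is an illustration under stated assumptions rather than the paper's code: the signal is a synthetic AR(1) process (not whale data), the window length is 8, patches are not mean-centred, and `dct_filterbank` and `pca_filterbank` are hypothetical helper names. The sketch estimates PCA filters from overlapping patches and measures how closely each one aligns with the DCT-II basis function of the same index.

```python
import numpy as np

def dct_filterbank(window):
    """Rows are orthonormal DCT-II basis vectors of length `window`."""
    n = np.arange(window)
    k = np.arange(window)[:, None]
    basis = np.sqrt(2.0 / window) * np.cos(np.pi * (n + 0.5) * k / window)
    basis[0] /= np.sqrt(2.0)  # rescale the constant (DC) row to unit norm
    return basis

def pca_filterbank(signal, window):
    """Rows are eigenvectors of the local sample covariance,
    ordered by decreasing eigenvalue."""
    starts = np.arange(len(signal) - window + 1)
    patches = signal[starts[:, None] + np.arange(window)[None, :]]
    cov = patches.T @ patches / patches.shape[0]
    _, eigvecs = np.linalg.eigh(cov)  # eigh returns ascending eigenvalues
    return eigvecs[:, ::-1].T

# Assumed stand-in data: a strongly correlated AR(1) process.
rng = np.random.default_rng(0)
noise = rng.standard_normal(50_000)
signal = np.empty_like(noise)
signal[0] = noise[0]
for t in range(1, len(noise)):
    signal[t] = 0.95 * signal[t - 1] + noise[t]

window = 8
dct = dct_filterbank(window)
pca = pca_filterbank(signal, window)

# |inner product| between the i-th PCA filter and the i-th DCT filter;
# values near 1 mean the two filters nearly coincide (up to sign).
agreement = np.abs(np.sum(dct * pca, axis=1))
```

When `agreement` is close to 1 for the leading filters, the fixed DCT filterbank can stand in for the learned PCA filters, which avoids estimating and decomposing a covariance matrix. How well this holds on real recordings depends on the data.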
Taxonomy
Topics: Music and Audio Processing · Speech Recognition and Synthesis · Speech and Audio Processing
Methods: Discrete Cosine Transform
