Audio Signal Processing Using Time Domain Mel-Frequency Wavelet Coefficient
Rinku Sebastian, Simon O'Keefe, Martin Trefzer

TL;DR
This paper introduces a novel time domain Mel frequency wavelet coefficient method that combines wavelet transform and Mel scale features to improve speech signal analysis efficiency and accuracy.
Contribution
It proposes a new feature extraction technique that reduces computational complexity by integrating wavelet and Mel scale features directly in the time domain.
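To make "directly in the time domain" concrete, here is a minimal sketch of a wavelet decomposition computed on the raw samples, with no Fourier transform involved. The Haar wavelet is used purely for illustration. This summary does not state the paper's wavelet, decomposition depth, or Mel-scale mapping, so the function names (`haar_dwt`, `haar_subband_energies`) and every design choice below are assumptions rather than the authors' method.

```python
import numpy as np


def haar_dwt(signal):
    """One level of the orthonormal Haar DWT.

    Returns (approximation, detail). Illustrative helper, not from the paper.
    """
    x = np.asarray(signal, dtype=float)
    if x.size % 2:
        # Assumption: pad odd-length input by repeating the last sample.
        x = np.append(x, x[-1])
    even, odd = x[0::2], x[1::2]
    approx = (even + odd) / np.sqrt(2.0)  # low-pass half-band
    detail = (even - odd) / np.sqrt(2.0)  # high-pass half-band
    return approx, detail


def haar_subband_energies(signal, levels):
    """Energy of each detail band (finest first), then the final approximation."""
    x = np.asarray(signal, dtype=float)
    energies = []
    for _ in range(levels):
        x, detail = haar_dwt(x)
        energies.append(float(np.sum(detail ** 2)))
    energies.append(float(np.sum(x ** 2)))
    return energies
```

Calling `haar_subband_energies(frame, 3)` on a speech frame whose length is a multiple of 8 returns four band energies that sum to the frame's total energy, since the transform is orthonormal. Each level needs only additions and subtractions of neighbouring samples.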
Findings
Enhanced speech feature representation combining wavelet and Mel scale.
Reduced computational load compared to traditional wavelet-based methods.
Improved efficiency in audio signal processing when used with reservoir computing.
Abstract
Extracting features from speech is the most critical step in speech signal processing. Mel Frequency Cepstral Coefficients (MFCC) are the most widely used features in the majority of speaker and speech recognition applications, as the filtering in this feature is similar to the filtering that takes place in the human ear. The main drawback of this feature is that it provides only the frequency information of the signal and does not indicate at what time each frequency is present. The wavelet transform, with its flexible time-frequency window, provides both time and frequency information and is an appropriate tool for analysing non-stationary signals such as speech. On the other hand, because of its uniform frequency scaling, a typical wavelet transform may be less effective in analysing speech signals, have poorer frequency resolution in low…
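The Mel scale mentioned in the abstract can be illustrated with a short sketch. It assumes the commonly used conversion mel = 2595 · log10(1 + f / 700); the paper may use a different variant. The helper names (`hz_to_mel`, `mel_to_hz`, `mel_band_edges`) and the band count and frequency range in the usage note are hypothetical choices for illustration only.

```python
import numpy as np


def hz_to_mel(f_hz):
    """Convert frequency in Hz to mels (assumed 2595 * log10(1 + f/700) form)."""
    return 2595.0 * np.log10(1.0 + np.asarray(f_hz, dtype=float) / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (np.asarray(mel, dtype=float) / 2595.0) - 1.0)


def mel_band_edges(n_bands, f_min_hz, f_max_hz):
    """Band edges in Hz that are equally spaced on the Mel scale."""
    mel_points = np.linspace(hz_to_mel(f_min_hz), hz_to_mel(f_max_hz), n_bands + 1)
    return mel_to_hz(mel_points)
```

For example, `mel_band_edges(8, 0.0, 8000.0)` gives nine edges whose spacing in Hz grows with frequency, so the bands are narrow at low frequencies and wide at high frequencies. This is the ear-like spacing that the abstract credits to MFCC filtering.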
