Long-frame-shift Neural Speech Phase Prediction with Spectral Continuity Enhancement and Interpolation Error Compensation
Yang Ai, Ye-Xin Lu, Zhen-Hua Ling

TL;DR
This paper introduces a novel neural speech phase prediction method capable of accurately predicting phase spectra with long frame shifts, improving waveform reconstruction quality in speech signal processing.
Contribution
The paper presents the first long-frame-shift neural speech phase prediction method that enhances spectral continuity and compensates for interpolation errors, enabling precise long-frame-shift phase estimation.
Findings
Outperforms existing NSPP and signal-processing methods in phase prediction quality.
Effectively predicts long-frame-shift phase spectra with high accuracy.
Improves speech waveform reconstruction fidelity.
Abstract
Speech phase prediction, which is a significant research focus in the field of signal processing, aims to recover speech phase spectra from amplitude-related features. However, existing speech phase prediction methods are constrained to recovering phase spectra with short frame shifts, which are considerably smaller than the theoretical upper bound required for exact waveform reconstruction from the short-time Fourier transform (STFT). To tackle this issue, we present a novel long-frame-shift neural speech phase prediction (LFS-NSPP) method, which enables precise prediction of long-frame-shift phase spectra from long-frame-shift log amplitude spectra. The proposed method consists of three stages: interpolation, prediction and decimation. The short-frame-shift log amplitude spectra are first constructed from long-frame-shift ones through frequency-by-frequency interpolation to enhance the…
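The three stages named in the abstract (interpolation, prediction, decimation) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation. The abstract does not state the interpolation scheme, so linear interpolation is an assumption here. The function names, the integer `factor` between long and short frame shifts, and the frame-picking decimation rule are all hypothetical. The neural phase predictor is replaced by a deterministic placeholder that only returns values in a valid phase range.

```python
import numpy as np


def interpolate_log_amplitude(log_amp_long, factor):
    """Stage 1 (assumed linear): interpolate each frequency bin along time.

    log_amp_long: array of shape (n_frames_long, n_bins)
    returns: array of shape ((n_frames_long - 1) * factor + 1, n_bins)
    """
    n_long, n_bins = log_amp_long.shape
    t_long = np.arange(n_long) * factor              # long-shift frame positions
    t_short = np.arange((n_long - 1) * factor + 1)   # short-shift frame positions
    out = np.empty((t_short.size, n_bins), dtype=float)
    for k in range(n_bins):                          # frequency-by-frequency
        out[:, k] = np.interp(t_short, t_long, log_amp_long[:, k])
    return out


def placeholder_phase_predictor(log_amp_short):
    """Stage 2 stand-in: NOT a neural model, only a hypothetical placeholder.

    Wraps the input values into (-pi, pi] so the output has the shape and
    range of a phase spectrum.
    """
    return np.angle(np.exp(1j * log_amp_short))


def decimate_frames(phase_short, factor):
    """Stage 3 (assumed): keep every `factor`-th short-shift frame."""
    return phase_short[::factor]


def lfs_phase_sketch(log_amp_long, factor, predictor=placeholder_phase_predictor):
    """Chain the three stages: interpolation, prediction, decimation."""
    log_amp_short = interpolate_log_amplitude(log_amp_long, factor)
    phase_short = predictor(log_amp_short)
    return decimate_frames(phase_short, factor)


# Usage: 3 long-shift frames, 2 frequency bins, short shift = long shift / 4.
log_amp_long = np.log(np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]))
phase_long = lfs_phase_sketch(log_amp_long, factor=4)
```

In this sketch the decimated output has one phase frame per input long-shift frame, so it lines up with the original long-frame-shift log amplitude spectra. Any real predictor with the same input and output shapes could be passed in through the `predictor` argument.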
Taxonomy
Topics: Neural Networks and Applications · Blind Source Separation Techniques · Image and Signal Denoising Methods
