A Phase Vocoder based on Nonstationary Gabor Frames
Emil Solsbæk Ottosen, Monika Dörfler

TL;DR
This paper introduces a novel phase vocoder algorithm utilizing nonstationary Gabor frames to improve time stretching of music signals, reducing artifacts and enhancing transient and sinusoidal component processing.
Contribution
It extends classical phase vocoder techniques with adaptive time-frequency representations and phase locking, using non-uniform NSGFs for lower redundancy and better artifact suppression.
Findings
Significantly reduces phasiness and transient smearing.
Achieves high-quality time stretching with a TF representation containing only about three times as many coefficients as the signal has samples.
Outperforms state-of-the-art algorithms in artifact reduction.
Abstract
We propose a new algorithm for time stretching music signals based on the theory of nonstationary Gabor frames (NSGFs). The algorithm extends the techniques of the classical phase vocoder (PV) by incorporating adaptive time-frequency (TF) representations and adaptive phase locking. The adaptive TF representations imply good time resolution for the onsets of attack transients and good frequency resolution for the sinusoidal components. We estimate the phase values only at peak channels and the remaining phases are then locked to the values of the peaks in an adaptive manner. During attack transients we keep the stretch factor equal to one and we propose a new strategy for determining which channels are relevant for reinitializing the corresponding phase values. In contrast to previously published algorithms we use a non-uniform NSGF to obtain a low redundancy of the corresponding TF…
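The peak-based phase estimation and phase locking described in the abstract can be illustrated with a minimal sketch. This is not the authors' NSGF algorithm: it assumes a uniform STFT grid and uses the classical "identity" phase-locking rule, and every name in it (`wrap`, `find_peaks`, `locked_phases`, `hop_a`, `hop_s`, the `transient` flag) is hypothetical. The transient branch also simplifies the paper's strategy by reinitializing all channels rather than a selected subset.

```python
import math


def wrap(x):
    """Map a phase to the principal interval [-pi, pi)."""
    return (x + math.pi) % (2 * math.pi) - math.pi


def find_peaks(mag):
    """Indices of channels whose magnitude exceeds both neighbours."""
    return [k for k in range(1, len(mag) - 1)
            if mag[k] > mag[k - 1] and mag[k] > mag[k + 1]]


def locked_phases(mag, phi_prev, phi_cur, synth_prev,
                  n_fft, hop_a, hop_s, transient=False):
    """Synthesis phases for one frame (uniform-grid sketch, assumed setup).

    mag, phi_cur : magnitudes and analysis phases of the current frame
    phi_prev     : analysis phases of the previous frame
    synth_prev   : synthesis phases of the previous frame
    hop_a, hop_s : analysis and synthesis hop sizes (stretch = hop_s / hop_a)
    """
    if transient:
        # Simplification: at an attack, reinitialize every channel to its
        # analysis phase. The caller is expected to use hop_s == hop_a
        # (stretch factor one) for this frame.
        return list(phi_cur)

    # Fall back to per-channel propagation if no local maximum exists.
    peaks = find_peaks(mag) or list(range(len(mag)))
    synth = [0.0] * len(mag)

    # 1) Propagate the phase only at peak channels.
    for p in peaks:
        omega = 2 * math.pi * p / n_fft          # channel centre frequency
        dev = wrap(phi_cur[p] - phi_prev[p] - omega * hop_a)
        inst_freq = omega + dev / hop_a          # instantaneous frequency
        synth[p] = synth_prev[p] + hop_s * inst_freq

    # 2) Lock every other channel to its nearest peak, preserving the
    #    analysis phase difference between the channel and that peak.
    for k in range(len(mag)):
        p = min(peaks, key=lambda q: abs(q - k))
        if k != p:
            synth[k] = synth[p] + phi_cur[k] - phi_cur[p]
    return synth
```

With `hop_s == hop_a` and `synth_prev == phi_prev`, the sketch reproduces the analysis phases up to multiples of 2π, which is a quick sanity check that no stretching implies no phase modification.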
