Phase Vocoder Done Right
Zdenek Prusa, Nicki Holighaus

TL;DR
This paper introduces a novel phase correction method for the phase vocoder that improves time-stretching quality by avoiding artifacts without needing peak detection or transient tracking.
Contribution
The paper presents a new phase correction technique for the phase vocoder, based on estimating the STFT phase gradient and integrating it, that performs time stretching without explicit peak picking, peak tracking, or transient detection.
Findings
Avoids the typical phase vocoder artifacts, even at extreme time-stretching factors
Requires no explicit peak picking or tracking, and no detection or separate treatment of transients
Improves audio quality in time-scaling applications
Abstract
The phase vocoder (PV) is a widely spread technique for processing audio signals. It employs a short-time Fourier transform (STFT) analysis-modify-synthesis loop and is typically used for time-scaling of signals by means of using different time steps for STFT analysis and synthesis. The main challenge of PV used for that purpose is the correction of the STFT phase. In this paper, we introduce a novel method for phase correction based on phase gradient estimation and its integration. The method does not require explicit peak picking and tracking nor does it require detection of transients and their separate treatment. Yet, the method does not suffer from the typical phase vocoder artifacts even for extreme time stretching factors.
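To make the analysis-modify-synthesis loop concrete, here is a minimal sketch of a classic phase vocoder time stretch in Python with NumPy. It is the textbook baseline with per-bin phase propagation, not the phase gradient integration method proposed in the paper. The function names (`stft`, `istft`, `time_stretch`) and the parameter defaults (`n_fft=1024`, `hop_a=256`) are assumptions chosen for illustration.

```python
import numpy as np

def stft(x, win, hop):
    """Windowed FFT frames taken every `hop` samples (analysis step)."""
    n = len(win)
    frames = 1 + (len(x) - n) // hop
    return np.array([np.fft.rfft(win * x[i * hop:i * hop + n])
                     for i in range(frames)])

def istft(spec, win, hop):
    """Weighted overlap-add resynthesis with frames placed every `hop` samples."""
    n = len(win)
    out = np.zeros(hop * (len(spec) - 1) + n)
    norm = np.zeros_like(out)
    for i, frame in enumerate(spec):
        out[i * hop:i * hop + n] += win * np.fft.irfft(frame, n)
        norm[i * hop:i * hop + n] += win ** 2
    # The floor avoids dividing by near-zero window energy at the signal edges.
    return out / np.maximum(norm, 1e-3)

def time_stretch(x, factor, n_fft=1024, hop_a=256):
    """Classic phase vocoder: analyse with hop_a, resynthesise with hop_s."""
    hop_s = int(round(hop_a * factor))       # synthesis time step
    win = np.hanning(n_fft)
    spec = stft(x, win, hop_a)
    mag, phase = np.abs(spec), np.angle(spec)

    # Centre frequency of each bin in radians per sample.
    omega = 2 * np.pi * np.arange(n_fft // 2 + 1) / n_fft

    out_phase = np.empty_like(phase)
    out_phase[0] = phase[0]
    for t in range(1, len(spec)):
        # Deviation of the measured phase advance from the bin-centre advance,
        # wrapped to [-pi, pi].
        dphi = phase[t] - phase[t - 1] - omega * hop_a
        dphi -= 2 * np.pi * np.round(dphi / (2 * np.pi))
        inst_freq = omega + dphi / hop_a     # estimated instantaneous frequency
        # Phase correction: advance each bin by its frequency over hop_s.
        out_phase[t] = out_phase[t - 1] + inst_freq * hop_s

    return istft(mag * np.exp(1j * out_phase), win, hop_s)

if __name__ == "__main__":
    sr = 16000
    tone = np.sin(2 * np.pi * 440 * np.arange(sr) / sr)
    stretched = time_stretch(tone, 2.0)
    print(len(tone), len(stretched))
```

Each bin's phase is propagated independently here, so the phase relationships between neighbouring bins are not preserved for general signals. That loss of coherence is a common source of the artifacts the abstract refers to, and the phase correction step inside the loop is the part the paper replaces.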
Taxonomy
Topics: Speech and Audio Processing · Digital Filter Design and Implementation · Advanced Adaptive Filtering Techniques
