Declipping of Speech Signals Using Frequency Selective Extrapolation
Markus Jonscher, Jürgen Seiler, André Kaup

TL;DR
This paper adapts Frequency Selective Extrapolation (FSE), a technique commonly used for error concealment and the reconstruction of incomplete image data, to restore clipped speech signals, and reports higher SNR than other state-of-the-art declipping algorithms.
Contribution
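It applies FSE to speech declipping: the signal is modeled as an iterative superposition of Fourier basis functions, and clipped samples are replaced by estimates from that model. The approach is compared against other state-of-the-art declipping algorithms.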
Findings
Maximum SNR gain of 3.5 dB over other state-of-the-art declipping algorithms
Average SNR gain of 1 dB
Evaluated on different speech test data sets
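To make the reported gains concrete, here is a minimal sketch of how an SNR gain between two reconstructions could be computed. It assumes the common definition SNR = 10 log10(signal energy / error energy) over all samples. The summary does not state the exact SNR variant or evaluation region used in the paper, so the function names `snr_db` and `snr_gain_db` and this definition are illustrative assumptions.

```python
import numpy as np

def snr_db(reference, estimate):
    """SNR in dB of `estimate` against the clean `reference`.

    Assumed definition: 10 * log10(sum(ref^2) / sum((ref - est)^2)).
    """
    reference = np.asarray(reference, dtype=float)
    estimate = np.asarray(estimate, dtype=float)
    signal_energy = np.sum(reference ** 2)
    error_energy = np.sum((reference - estimate) ** 2)
    if error_energy == 0:
        return float("inf")  # perfect reconstruction
    return float(10.0 * np.log10(signal_energy / error_energy))

def snr_gain_db(reference, candidate, baseline):
    """Gain in dB of one reconstruction over another (hypothetical helper)."""
    return snr_db(reference, candidate) - snr_db(reference, baseline)
```

Under this definition, a gain of 1 dB means the error energy of the candidate reconstruction is about 21% lower than that of the baseline.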
Abstract
The reconstruction of clipped speech signals is an important task in audio signal processing to achieve enhanced audio quality for further processing. In this paper, Frequency Selective Extrapolation (FSE), which is commonly used for error concealment or the reconstruction of incomplete image data, is adapted to restore audio signals that are distorted by clipping. For this, FSE generates a model of the signal as an iterative superposition of Fourier basis functions. Clipped samples can then be replaced by estimated samples from the model. The performance of the proposed algorithm is evaluated using different speech test data sets. Compared to other state-of-the-art declipping algorithms, this leads to a maximum SNR gain of 3.5 dB and an average gain of 1 dB.
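The idea of building a model from Fourier basis functions and replacing clipped samples with model values can be illustrated with a simplified, matching-pursuit-style sketch. This is not the authors' algorithm. The function name `fse_declip_sketch`, the damping factor `gamma`, the iteration count, and the rule that treats samples with magnitude at or above the threshold as clipped are all assumptions made for illustration. The adaptations the paper makes for speech are not described in the abstract and are not reproduced here.

```python
import numpy as np

def fse_declip_sketch(clipped, threshold, n_iter=200, gamma=0.5):
    """Greedy Fourier-model sketch; returns (restored, residual_energies)."""
    y = np.asarray(clipped, dtype=float)
    n_samples = y.size
    n = np.arange(n_samples)
    # Assumption: samples with |y| >= threshold are clipped (unreliable).
    reliable = (np.abs(y) < threshold).astype(float)
    weight_sum = reliable.sum()
    model = np.zeros(n_samples)
    energies = []
    if weight_sum == 0:
        return y.copy(), energies
    for _ in range(n_iter):
        # Residual is measured on reliable samples only.
        residual = reliable * (y - model)
        energies.append(float(np.sum(residual ** 2)))
        # Pick the Fourier basis function with the largest residual energy.
        spectrum = np.fft.rfft(residual)
        k = int(np.argmax(np.abs(spectrum)))
        # Damped projection coefficient; gamma < 1 compensates for the loss
        # of orthogonality caused by the missing samples.
        coeff = gamma * spectrum[k] / weight_sum
        atom = np.exp(2j * np.pi * k * n / n_samples)
        # DC and Nyquist bins have no distinct conjugate partner.
        self_conjugate = k == 0 or (n_samples % 2 == 0 and k == n_samples // 2)
        scale = 1.0 if self_conjugate else 2.0
        model += scale * (coeff * atom).real
    # Keep reliable samples, replace clipped ones with the model estimate.
    restored = np.where(reliable > 0, y, model)
    return restored, energies

# Usage on a synthetic two-tone signal (hypothetical test data, not speech).
t = np.arange(256)
clean = 0.6 * np.sin(2 * np.pi * 5 * t / 256) + 0.4 * np.sin(2 * np.pi * 12 * t / 256)
clipped = np.clip(clean, -0.5, 0.5)
restored, energies = fse_declip_sketch(clipped, threshold=0.5)
```

With `gamma` between 0 and 1, the residual energy on the reliable samples does not increase from one iteration to the next. How well the clipped regions are extrapolated is a separate question: reliable samples of a clipped signal are concentrated at low amplitudes, which can bias a purely greedy selection toward harmonics, so this sketch should not be expected to reproduce the gains reported above.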
Taxonomy
Topics: Speech and Audio Processing · Advanced Adaptive Filtering Techniques · Speech Recognition and Synthesis
