Nonlinear predictive models computation in ADPCM schemes
Marcos Faundez-Zanuy

TL;DR
This paper enhances nonlinear speech prediction in ADPCM schemes by introducing new training approaches, including Bayesian regularization, resulting in improved SEGSNR and more stable output quality.
Contribution
It proposes new training methods for the neural-network nonlinear predictor in an ADPCM coder, improving SEGSNR by up to 1.2 dB over the authors' earlier ICASSP98 system and reducing the variation in quality between frames.
Findings
Up to 1.2 dB SEGSNR improvement
Reduced variance of SEGSNR between frames
More stable output quality
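Both findings rest on the segmental SNR (SEGSNR), the mean of the per-frame SNR values in dB, and on how much those values vary between frames. The following is a minimal sketch of how the two quantities could be computed. The function names, the default frame length of 200 samples, and the small `eps` guard are illustrative assumptions, not values taken from the paper.

```python
import math


def frame_snrs_db(original, decoded, frame_len=200, eps=1e-12):
    """Per-frame SNR in dB over non-overlapping frames.

    frame_len and eps are illustrative assumptions; a trailing
    partial frame is dropped.
    """
    snrs = []
    for start in range(0, len(original) - frame_len + 1, frame_len):
        signal_energy = 0.0
        error_energy = 0.0
        for x, y in zip(original[start:start + frame_len],
                        decoded[start:start + frame_len]):
            signal_energy += x * x
            error_energy += (x - y) ** 2
        # eps avoids log(0) and division by zero on silent or perfect frames
        snrs.append(10.0 * math.log10((signal_energy + eps) / (error_energy + eps)))
    if not snrs:
        raise ValueError("signal is shorter than one frame")
    return snrs


def segsnr_db(original, decoded, frame_len=200):
    """Segmental SNR: the mean of the per-frame SNRs in dB."""
    snrs = frame_snrs_db(original, decoded, frame_len)
    return sum(snrs) / len(snrs)


def segsnr_variance(original, decoded, frame_len=200):
    """Population variance of the per-frame SNRs (in dB squared)."""
    snrs = frame_snrs_db(original, decoded, frame_len)
    mean = sum(snrs) / len(snrs)
    return sum((s - mean) ** 2 for s in snrs) / len(snrs)
```

A lower `segsnr_variance` at a similar or higher `segsnr_db` corresponds to the "more stable output quality" reported above.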
Abstract
Several papers have recently been published on nonlinear prediction applied to speech coding. At ICASSP98 we presented a system based on an ADPCM scheme with a nonlinear predictor based on a neural net. The most critical factor was the training procedure, which must achieve good generalization and robustness against mismatch between training and testing conditions. In this paper, we propose several new approaches that improve the performance of the original system by up to 1.2 dB of SEGSNR (using Bayesian regularization). The variance of the SEGSNR between frames is also reduced, so the new scheme produces a more stable output quality.
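To make the setting concrete, here is a minimal sketch of an ADPCM loop with a pluggable predictor. It is not the paper's coder. The step-size adaptation rule, the 3-bit quantizer, the predictor order, and `toy_predictor` (a fixed tanh function standing in for a trained neural net) are all hypothetical choices made for illustration. The point it shows is structural: the predictor sees only previously reconstructed samples, so the decoder can rebuild the same predictions from the transmitted codes alone.

```python
import math


def toy_predictor(history):
    """Hypothetical stand-in for a trained neural-net predictor."""
    return math.tanh(1.5 * history[-1] - 0.6 * history[-2])


def _next_step(step, code, qmax, lo=1e-3, hi=1e3):
    # Illustrative adaptation rule: grow the step when the quantizer
    # saturates, shrink it otherwise.
    factor = 1.5 if abs(code) == qmax else 0.9
    return min(hi, max(lo, step * factor))


def adpcm_encode(samples, predictor, order=2, bits=3, step=0.1):
    """Return (codes, reconstruction) for a list of samples."""
    qmax = 2 ** (bits - 1) - 1
    history = [0.0] * order
    codes, recon = [], []
    for x in samples:
        pred = predictor(history)
        # Quantize the prediction error to an integer code
        code = max(-qmax, min(qmax, round((x - pred) / step)))
        y = pred + code * step          # local reconstruction
        codes.append(code)
        recon.append(y)
        history = history[1:] + [y]     # predictor sees decoded samples only
        step = _next_step(step, code, qmax)
    return codes, recon


def adpcm_decode(codes, predictor, order=2, bits=3, step=0.1):
    """Rebuild the signal from the codes, mirroring the encoder state."""
    qmax = 2 ** (bits - 1) - 1
    history = [0.0] * order
    recon = []
    for code in codes:
        y = predictor(history) + code * step
        recon.append(y)
        history = history[1:] + [y]
        step = _next_step(step, code, qmax)
    return recon
```

In a scheme like the one the abstract describes, the training procedure determines how well the predictor generalizes to speech it was not trained on; here the predictor is fixed only to keep the sketch short.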
Taxonomy
Topics: Speech and Audio Processing · Speech Recognition and Synthesis · Advanced Data Compression Techniques
