Speech Prediction using an Adaptive Recurrent Neural Network with Application to Packet Loss Concealment

Reza Lotfidereshgi; Philippe Gournay

arXiv:2111.08116·eess.AS·November 17, 2021·ICASSP

TL;DR

This paper introduces a recurrent neural network that predicts speech samples directly and is trained online on the recent past of the signal. Applied to packet loss concealment, it outperforms the standard ITU G.711 Appendix I technique.

Contribution

It presents an RNN-based speech predictor that is trained online on recent speech samples rather than offline on parametric features, and applies it to packet loss concealment (PLC), where it outperforms the ITU G.711 Appendix I technique.

Findings

01

Outperforms ITU G.711 Appendix I PLC in tests

02

Operates directly on speech samples, not features

03

Can be pre-trained offline for faster convergence

Abstract

This paper proposes a novel approach for speech signal prediction based on a recurrent neural network (RNN). Unlike existing RNN-based predictors, which operate on parametric features and are trained offline on a large collection of such features, the proposed predictor operates directly on speech samples and is trained online on the recent past of the speech signal. Optionally, the network can be pre-trained offline to speed up convergence at start-up. The proposed predictor is a single end-to-end network that captures all sorts of dependencies between samples, and therefore has the potential to outperform classical linear/non-linear and short-term/long-term speech predictor structures. We apply it to the packet loss concealment (PLC) problem and show that it outperforms the standard ITU G.711 Appendix I PLC technique.
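
The loop described in the abstract (adapt on the samples that arrive, then extrapolate across a lost packet) can be sketched in a few lines of Python. This is only an illustration and not the authors' network: the Elman-style architecture, the hidden size, the learning rate, the one-step truncated gradient, and the names `TinyRNNPredictor` and `conceal` are all assumptions made for this example. The abstract does not specify any of them.

```python
import math
import random


class TinyRNNPredictor:
    """Hypothetical Elman-style RNN: one sample in, next-sample prediction out."""

    def __init__(self, hidden=8, lr=0.02, seed=0):
        rng = random.Random(seed)
        self.n, self.lr = hidden, lr
        self.w_in = [rng.uniform(-0.5, 0.5) for _ in range(hidden)]
        self.w_rec = [[rng.uniform(-0.3, 0.3) for _ in range(hidden)]
                      for _ in range(hidden)]
        self.b = [0.0] * hidden
        self.w_out = [0.0] * hidden
        self.b_out = 0.0
        self.h = [0.0] * hidden
        self.h_prev, self.x_prev, self.y = self.h, 0.0, 0.0

    def step(self, x):
        """Advance the state with sample x and return the predicted next sample."""
        self.x_prev, self.h_prev = x, self.h
        self.h = [
            math.tanh(self.w_in[j] * x + self.b[j]
                      + sum(self.w_rec[j][k] * self.h_prev[k]
                            for k in range(self.n)))
            for j in range(self.n)
        ]
        self.y = self.b_out + sum(w * h for w, h in zip(self.w_out, self.h))
        return self.y

    def adapt(self, target):
        """One SGD step on squared error, gradient truncated to one time step."""
        err = self.y - target
        for j in range(self.n):
            # Gradient w.r.t. the pre-activation of hidden unit j.
            g = err * self.w_out[j] * (1.0 - self.h[j] ** 2)
            self.w_out[j] -= self.lr * err * self.h[j]
            self.w_in[j] -= self.lr * g * self.x_prev
            self.b[j] -= self.lr * g
            for k in range(self.n):
                self.w_rec[j][k] -= self.lr * g * self.h_prev[k]
        self.b_out -= self.lr * err


def conceal(received, n_lost, model=None):
    """Adapt online on received samples, then extrapolate n_lost samples."""
    model = model or TinyRNNPredictor()  # pass a pre-trained model to skip cold start
    pred = 0.0
    for t, x in enumerate(received):
        if t > 0:
            model.adapt(x)       # score the previous prediction against the new sample
        pred = model.step(x)
    out = []
    for _ in range(n_lost):
        out.append(pred)
        pred = model.step(pred)  # during the loss, predictions are fed back as input
    return out


# Toy input: a 200 Hz tone at 8 kHz (the G.711 sampling rate), then a 10 ms gap.
fs = 8000
speech_like = [0.5 * math.sin(2 * math.pi * 200 * t / fs) for t in range(800)]
concealed = conceal(speech_like, n_lost=80)
```

The optional `model` argument mirrors the pre-training option mentioned in the abstract: a network whose weights were fitted offline could be passed in so that online adaptation starts from a better point. A practical concealment system would also need to handle the transition back to received audio, which this sketch leaves out.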
