Deep Transform: Time-Domain Audio Error Correction via Probabilistic Re-Synthesis
Andrew J.R. Simpson

TL;DR
This paper introduces a deep neural network-based method called Deep Transform for probabilistic re-synthesis of degraded time-domain speech, enabling recovery of heavily corrupted audio signals.
Contribution
The paper presents the deep transform, a technique that uses a convolutional neural network trained to re-synthesize its own time-domain input, and applies it to error correction in time-domain audio signals, especially under extreme degradation.
Findings
Successful recovery of heavily degraded speech signals
Effective correction of transmission errors in audio communication
Demonstrated potential for improving audio error correction systems
Abstract
In the process of recording, storage and transmission of time-domain audio signals, errors may be introduced that are difficult to correct in an unsupervised way. Here, we train a convolutional deep neural network to re-synthesize input time-domain speech signals at its output layer. We then use this abstract transformation, which we call a deep transform (DT), to perform probabilistic re-synthesis on further speech (of the same speaker) which has been degraded. Using the convolutive DT, we demonstrate the recovery of speech audio that has been subject to extreme degradation. This approach may be useful for correction of errors in communications devices.
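The two-stage idea in the abstract (train a network to re-synthesize clean time-domain input, then pass degraded signal through the trained network) can be illustrated with a minimal sketch. Everything specific here is an assumption for illustration and not taken from the paper: a small dense autoencoder stands in for the convolutional network, a sum of two sinusoids stands in for speech, additive noise stands in for the degradation, and the window size, hop, hidden width, and learning rate are arbitrary. The function names are hypothetical.

```python
import numpy as np

def frame_signal(x, win, hop):
    """Slice a 1-D signal into overlapping windows of length `win`."""
    n = 1 + (len(x) - win) // hop
    return np.stack([x[i * hop : i * hop + win] for i in range(n)])

def overlap_add(frames, hop):
    """Average overlapping windows back into a 1-D signal."""
    n, win = frames.shape
    out = np.zeros((n - 1) * hop + win)
    weight = np.zeros_like(out)
    for i in range(n):
        out[i * hop : i * hop + win] += frames[i]
        weight[i * hop : i * hop + win] += 1.0
    return out / weight

def train_autoencoder(frames, hidden=32, epochs=300, lr=0.2, seed=0):
    """Fit a one-hidden-layer network to reproduce its input (MSE loss).

    Assumption: a dense tanh layer replaces the paper's convolutional net.
    """
    rng = np.random.default_rng(seed)
    n, win = frames.shape
    w1 = rng.normal(0.0, 0.1, (win, hidden))
    w2 = rng.normal(0.0, 0.1, (hidden, win))
    losses = []
    for _ in range(epochs):
        h = np.tanh(frames @ w1)           # hidden activations
        err = h @ w2 - frames              # re-synthesis error
        losses.append(float(np.mean(err ** 2)))
        g = 2.0 * err / (n * win)          # dLoss/dOutput for mean squared error
        grad_w2 = h.T @ g
        grad_w1 = frames.T @ ((g @ w2.T) * (1.0 - h ** 2))
        w1 -= lr * grad_w1
        w2 -= lr * grad_w2
    return (w1, w2), losses

def resynthesize(frames, params):
    """Pass (possibly degraded) windows through the trained network."""
    w1, w2 = params
    return np.tanh(frames @ w1) @ w2

# Stand-in "clean speech": two sinusoids (assumption, not real speech).
t = np.arange(4000)
clean = 0.3 * np.sin(2 * np.pi * t / 50) + 0.2 * np.sin(2 * np.pi * t / 17)
win, hop = 64, 32

# Stage 1: learn to re-synthesize clean windows.
clean_frames = frame_signal(clean, win, hop)
params, losses = train_autoencoder(clean_frames)

# Stage 2: apply the trained transform to a degraded copy.
noise = np.random.default_rng(1).normal(0.0, 0.2, clean.shape)
degraded_frames = frame_signal(clean + noise, win, hop)
recovered = overlap_add(resynthesize(degraded_frames, params), hop)
```

The sketch only shows the data flow: windowing, identity-style training on clean material, and re-synthesis of degraded windows followed by overlap-add. It makes no claim about recovery quality, which in the paper depends on the convolutional architecture and on training with speech from the same speaker.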
Taxonomy
Topics: Speech and Audio Processing · Music and Audio Processing · Speech Recognition and Synthesis
