Deep Transform: Time-Domain Audio Error Correction via Probabilistic Re-Synthesis

Andrew J.R. Simpson

arXiv:1503.05849·cs.SD·March 20, 2015

TL;DR

This paper introduces the deep transform (DT): a convolutional deep neural network trained to re-synthesize time-domain speech at its output layer, then used for probabilistic re-synthesis of degraded speech from the same speaker, recovering signals that have undergone extreme degradation.

Contribution

It presents the deep transform, a convolutional neural network trained to reproduce time-domain speech at its output layer and then applied to correct errors in degraded speech from the same speaker, including speech subjected to extreme degradation.

Findings

01

Successful recovery of heavily degraded speech signals

02

Possible correction of transmission errors in audio communication (proposed as an application rather than demonstrated)

03

Demonstrated potential for improving audio error correction systems

Abstract

In the process of recording, storage and transmission of time-domain audio signals, errors may be introduced that are difficult to correct in an unsupervised way. Here, we train a convolutional deep neural network to re-synthesize input time-domain speech signals at its output layer. We then use this abstract transformation, which we call a deep transform (DT), to perform probabilistic re-synthesis on further speech (of the same speaker) which has been degraded. Using the convolutive DT, we demonstrate the recovery of speech audio that has been subject to extreme degradation. This approach may be useful for correction of errors in communications devices.
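
The train-then-re-synthesize idea in the abstract can be illustrated with a toy sketch. It is not the author's implementation: it assumes a small dense autoencoder in NumPy instead of a convolutional deep network, a synthetic two-tone signal instead of speech, and an arbitrary degradation (added noise plus random sample dropouts). All names (`frame_signal`, `forward`, `train`, `FRAME`) and hyperparameters are hypothetical. The sketch only shows the data flow (fit a network to reproduce clean frames, then pass degraded frames through it) and makes no claim about recovery quality.

```python
import numpy as np

FRAME = 32  # samples per frame (hypothetical choice, not from the paper)

def frame_signal(x, frame_len=FRAME):
    """Cut a 1-D signal into non-overlapping frames, one per row."""
    n = len(x) // frame_len
    return x[: n * frame_len].reshape(n, frame_len)

def forward(params, frames):
    """Run frames through the autoencoder; return hidden and output layers."""
    W1, b1, W2, b2 = params
    h = np.tanh(frames @ W1 + b1)
    return h, h @ W2 + b2

def train(frames, hidden=16, epochs=500, lr=0.2, seed=0):
    """Fit the network to reproduce its own input under mean squared error."""
    rng = np.random.default_rng(seed)
    d = frames.shape[1]
    params = [rng.normal(0, 0.1, (d, hidden)), np.zeros(hidden),
              rng.normal(0, 0.1, (hidden, d)), np.zeros(d)]
    losses = []
    for _ in range(epochs):
        h, out = forward(params, frames)
        err = out - frames
        losses.append(float(np.mean(err ** 2)))
        g_out = 2.0 * err / err.size                   # dLoss/dOutput
        g_h = (g_out @ params[2].T) * (1.0 - h ** 2)   # back through tanh
        grads = [frames.T @ g_h, g_h.sum(axis=0),
                 h.T @ g_out, g_out.sum(axis=0)]
        for p, g in zip(params, grads):
            p -= lr * g                                # in-place gradient step
    return params, losses

# Assumption: a two-tone signal stands in for one speaker's clean speech.
rng = np.random.default_rng(1)
t = np.arange(8000) / 8000.0
clean = 0.6 * np.sin(2 * np.pi * 220 * t) + 0.3 * np.sin(2 * np.pi * 440 * t)

# Step 1: train the network to re-synthesize clean input at its output.
params, losses = train(frame_signal(clean[:4000]))

# Step 2: degrade further material from the same source (noise + dropouts).
held_out = clean[4000:]
degraded = held_out + 0.3 * rng.normal(size=held_out.shape)
degraded[rng.random(held_out.shape) < 0.1] = 0.0

# Step 3: pass the degraded frames through the trained network.
_, restored_frames = forward(params, frame_signal(degraded))
restored = restored_frames.reshape(-1)
```

Comparing `restored` and `degraded` against `held_out` (for example by mean squared error) is one way to check whether the learned mapping pulls corrupted frames back toward the training distribution; the outcome depends on the architecture and data, which is why the paper's convolutional network and real speech matter.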

Taxonomy

Topics: Speech and Audio Processing · Music and Audio Processing · Speech Recognition and Synthesis