Stochastic Restoration of Heavily Compressed Musical Audio using Generative Adversarial Networks
Stefan Lattner, Javier Nistal

TL;DR
This paper explores a stochastic GAN-based approach to restoring heavily compressed musical audio. It reports improved quality over the MP3 versions at low bitrates and shows that stochastic generators have an advantage over deterministic ones.
Contribution
The paper introduces a GAN architecture with a stochastic generator, conditioned on the compressed signal, for restoring heavily compressed musical audio.
Findings
The restoration models improve audio quality over the MP3 versions at 16 and 32 kbit/s.
Stochastic generators produce outputs closer to the original signals than deterministic ones.
An extensive evaluation with objective metrics and listening tests supports both findings.
Abstract
Lossy audio codecs compress (and decompress) digital audio streams by removing information that tends to be inaudible in human perception. Under high compression rates, such codecs may introduce a variety of impairments in the audio signal. Many works have tackled the problem of audio enhancement and compression artifact removal using deep learning techniques. However, only a few works tackle the restoration of heavily compressed audio signals in the musical domain. In such a scenario, there is no unique solution for the restoration of the original signal. Therefore, in this study, we test a stochastic generator of a Generative Adversarial Network (GAN) architecture for this task. Such a stochastic generator, conditioned on highly compressed musical audio signals, could one day generate outputs indistinguishable from high-quality releases. Therefore, the present study may yield insights…
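The abstract's point that restoration has no unique solution can be illustrated with a toy sketch. This is not the paper's model: the "codec" here is plain coarse quantization rather than MP3, the "generator" is a hand-written function rather than a trained network, and all names (`compress`, `restore`, `sample_noise`, `step`) are hypothetical. The sketch only shows that a generator given the same compressed input and different noise vectors can return different restorations, each of which compresses back to that same input.

```python
import random

STEP = 0.5  # assumed quantization step of the toy codec


def compress(signal, step=STEP):
    """Toy lossy 'codec' (assumption, not MP3): coarse quantization.

    Many different signals map to the same compressed signal.
    """
    return [round(s / step) * step for s in signal]


def restore(compressed, z, step=STEP):
    """Toy stochastic 'generator' (assumption, not a trained GAN).

    The noise vector z selects one of many signals that are
    consistent with the compressed input.
    """
    return [c + step * zi for c, zi in zip(compressed, z)]


def sample_noise(n, rng):
    """Draw noise well inside one quantization cell."""
    return [rng.uniform(-0.4, 0.4) for _ in range(n)]


rng = random.Random(0)
original = [0.12, 0.49, -0.31, 0.90]
compressed = compress(original)

# Same conditioning signal, two different noise vectors.
restored_a = restore(compressed, sample_noise(len(compressed), rng))
restored_b = restore(compressed, sample_noise(len(compressed), rng))

# A deterministic restorer corresponds to a fixed z (here all zeros):
# it always returns the same single candidate.
restored_det = restore(compressed, [0.0] * len(compressed))

print("compressed:  ", compressed)
print("restored (a):", restored_a)
print("restored (b):", restored_b)
print("deterministic:", restored_det)
```

In this toy setting both stochastic outputs are equally valid restorations of the compressed signal, while the deterministic variant is limited to one fixed candidate.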
