A Time-Frequency Generative Adversarial based method for Audio Packet Loss Concealment
Carlo Aironi, Samuele Cornell, Luca Serafini, Stefano Squartini

TL;DR
This paper introduces a GAN-based method for audio packet loss concealment that improves speech quality during VoIP transmission by effectively repairing lost audio fragments using spectrogram translation.
Contribution
It proposes bin2bin, an enhanced pix2pix framework with combined loss functions and a modified discriminator, to better restore lost audio packets in real time.
Findings
Outperforms current state-of-the-art methods in handling high packet loss.
Effectively restores large gaps in audio streams.
Reduces concealment time with improved phase reconstruction.
Abstract
Packet loss is a major cause of voice quality degradation in VoIP transmissions with serious impact on intelligibility and user experience. This paper describes a system based on a generative adversarial approach, which aims to repair the lost fragments during the transmission of audio streams. Inspired by the powerful image-to-image translation capability of Generative Adversarial Networks (GANs), we propose bin2bin, an improved pix2pix framework to achieve the translation task from magnitude spectrograms of audio frames with lost packets, to noncorrupted speech spectrograms. In order to better maintain the structural information after spectrogram translation, this paper introduces the combination of two STFT-based loss functions, mixed with the traditional GAN objective. Furthermore, we employ a modified PatchGAN structure as discriminator and we lower the concealment time by a proper…
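To make the combined objective in the abstract more concrete, here is a minimal sketch of how two STFT-based terms might be mixed with an adversarial term. The abstract does not name the two losses, so this sketch assumes a spectral-convergence term and a log-magnitude L1 term, a pairing that is common in the speech synthesis literature. The function names, the weights `w_sc` and `w_mag`, and the toy spectrograms are all hypothetical and are not taken from the paper.

```python
import math

def spectral_convergence(target, estimate, eps=1e-7):
    """Frobenius norm of the magnitude error, relative to the target's norm."""
    num = sum((t - e) ** 2 for tr, er in zip(target, estimate) for t, e in zip(tr, er))
    den = sum(t ** 2 for tr in target for t in tr)
    # eps guards against an all-zero target spectrogram
    return math.sqrt(num) / max(math.sqrt(den), eps)

def log_magnitude_l1(target, estimate, eps=1e-7):
    """Mean absolute difference between log-magnitudes of the two spectrograms."""
    total, count = 0.0, 0
    for tr, er in zip(target, estimate):
        for t, e in zip(tr, er):
            total += abs(math.log(t + eps) - math.log(e + eps))
            count += 1
    return total / count

def generator_objective(adv_loss, target, estimate, w_sc=1.0, w_mag=1.0):
    """Adversarial term plus weighted STFT-based terms (weights are assumptions)."""
    return (adv_loss
            + w_sc * spectral_convergence(target, estimate)
            + w_mag * log_magnitude_l1(target, estimate))

# Toy 2x2 magnitude spectrograms (rows = frames, columns = frequency bins).
clean = [[1.0, 2.0], [3.0, 4.0]]
repaired = [[1.0, 2.0], [3.0, 2.0]]  # one bin reconstructed incorrectly

print(spectral_convergence(clean, repaired))
print(log_magnitude_l1(clean, repaired))
print(generator_objective(0.5, clean, repaired))
```

In a real training setup these terms would be computed on tensors inside a deep learning framework so that gradients reach the generator; the pure-Python version here only illustrates the arithmetic.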
Taxonomy
Topics: Speech and Audio Processing · Digital Media Forensic Detection · Image and Signal Denoising Methods
Methods: Repair · Batch Normalization · Dropout · Sigmoid Activation · Concatenated Skip Connection · Convolution · PatchGAN · Pix2Pix
