PostGAN: A GAN-Based Post-Processor to Enhance the Quality of Coded Speech
Srikanth Korse, Nicola Pia, Kishan Gupta, Guillaume Fuchs

TL;DR
PostGAN is a GAN-based neural post-processor for low-bitrate coded speech. It improves quality by around 20 MUSHRA points and outperforms earlier data-driven post-processors, as evaluated on the Bluetooth LC3 codec.
Contribution
This paper introduces PostGAN, a novel GAN-based post-processing method operating in the sub-band domain with a U-Net architecture for speech quality enhancement.
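To make the U-Net idea concrete, here is a minimal sketch of one encoder/decoder level with a concatenated skip connection, applied to a small array shaped like (sub-bands x frames). It is not PostGAN's network: the layer sizes, constant weights, and function names are all hypothetical, and the GAN training and learned affine transform are left out entirely. It only shows how convolution, max pooling, upsampling, and skip concatenation fit together.

```python
# Illustrative sketch only: a one-level U-Net pass over a [channels][frames] list.
# All names, sizes, and weights are hypothetical and not taken from PostGAN.

def conv1d_same(x, weights, bias):
    """Zero-padded 1-D convolution (cross-correlation, as in most DL frameworks).
    x is [in_ch][T]; weights is [out_ch][in_ch][K] with odd K."""
    k = len(weights[0][0])
    pad = k // 2
    t_len = len(x[0])
    out = []
    for oc, w_oc in enumerate(weights):
        row = []
        for t in range(t_len):
            acc = bias[oc]
            for ic, w_ic in enumerate(w_oc):
                for j, w in enumerate(w_ic):
                    idx = t + j - pad
                    if 0 <= idx < t_len:  # samples outside the signal count as zero
                        acc += w * x[ic][idx]
            row.append(acc)
        out.append(row)
    return out

def relu(x):
    return [[max(0.0, v) for v in row] for row in x]

def max_pool2(x):
    """Halve the time resolution by keeping the larger of each pair of frames."""
    return [[max(row[i], row[i + 1]) for i in range(0, len(row) - 1, 2)] for row in x]

def upsample2(x):
    """Double the time resolution by repeating each frame (nearest neighbour)."""
    return [[v for v in row for _ in range(2)] for row in x]

def concat_channels(a, b):
    """Concatenated skip connection: stack the channels of a and b."""
    return a + b

def make_weights(out_ch, in_ch, k, value):
    # Constant placeholder weights; a real model would learn these.
    return [[[value] * k for _ in range(in_ch)] for _ in range(out_ch)]

def tiny_unet(x):
    in_ch = len(x)
    enc = relu(conv1d_same(x, make_weights(4, in_ch, 3, 0.1), [0.0] * 4))   # encoder
    down = max_pool2(enc)                                                    # downsample
    mid = relu(conv1d_same(down, make_weights(4, 4, 3, 0.1), [0.0] * 4))    # bottleneck
    up = upsample2(mid)                                                      # upsample
    merged = concat_channels(up, enc)                                        # skip: 4 + 4 channels
    return conv1d_same(merged, make_weights(in_ch, 8, 3, 0.1), [0.0] * in_ch)

# Two hypothetical sub-bands, eight frames each (frame count must be even here).
subbands = [[0.1 * (b + 1) * ((t % 4) - 1.5) for t in range(8)] for b in range(2)]
enhanced = tiny_unet(subbands)
print(len(enhanced), len(enhanced[0]))  # prints "2 8"
```

The output has the same shape as the input, which is the property a post-processor needs: it takes decoded sub-band frames in and returns enhanced frames of the same size. The skip connection lets the decoder reuse full-resolution encoder features that pooling would otherwise discard.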
Findings
PostGAN improves speech quality by approximately 20 MUSHRA points.
It outperforms previous data-driven post-processors.
It was tested on the Bluetooth LC3 codec for wideband speech at its lowest bitrate (16 kbit/s).
Abstract
The quality of speech coded by transform coding is affected by various artefacts, especially when the bitrates available to quantize the frequency components become too low. In order to mitigate these coding artefacts and enhance the quality of coded speech, a post-processor that relies on a-priori information transmitted from the encoder is traditionally employed at the decoder side. In recent years, several data-driven post-processors have been proposed and shown to outperform traditional approaches. In this paper, we propose PostGAN, a GAN-based neural post-processor that operates in the sub-band domain and relies on the U-Net architecture and a learned affine transform. It has been tested on the recently standardized low-complexity, low-delay Bluetooth codec (LC3) for wideband speech at the lowest bitrate (16 kbit/s). Subjective evaluations and objective scores show that the newly…
Taxonomy
Topics: Advanced Data Compression Techniques · Speech and Audio Processing · Speech Recognition and Synthesis
Methods: Convolution · Max Pooling · Concatenated Skip Connection · U-Net
