A Survey of Advancing Audio Super-Resolution and Bandwidth Extension from Discriminative to Generative Models

Ningyuan Yang; Yize Li; Diego A. Cuji; Ryan M. Corey; Pu Zhao; Xue Lin; Andrew C. Singer

arXiv:2605.16681·eess.AS·May 20, 2026


TL;DR

This survey reviews the evolution of audio super-resolution from traditional discriminative models to modern generative approaches, highlighting key techniques, challenges, and future directions.

Contribution

It provides a comprehensive taxonomy and unified perspective on generative models for audio bandwidth extension and super-resolution, guiding future research.

Findings

01 Generative models improve perceptual quality over discriminative methods.

02 Key design trade-offs include fidelity, robustness, and computational efficiency.

03 Emerging directions involve large language models and multimodal foundation models.

Abstract

Audio super-resolution (SR), also referred to as bandwidth extension (BWE), aims to reconstruct high-fidelity signals from low-resolution (LR) or band-limited (BL) observations, an inherently ill-posed task due to the ambiguity of missing high-frequency (HF) content. This survey provides a comprehensive overview of the field, with a particular focus on the paradigm shift from discriminative mapping to modern generative modeling. We first review early discriminative deep neural network (DNN) models, which formulate BWE/SR as a deterministic mapping problem and are prone to regression-to-the-mean effects and spectral over-smoothing. We then systematically review generative approaches, including autoregressive (AR) models, variational autoencoders (VAEs), generative adversarial networks (GANs), diffusion and score-based models, flow-based methods, and Schrödinger bridges. Across these…
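The regression-to-the-mean effect mentioned in the abstract can be illustrated with a toy example. The sketch below is not taken from the survey: the sample rate, tone frequencies, cutoff, ensemble size, and the helper `hf_energy` are all assumptions chosen for illustration. It builds an ensemble of signals that share the same low band but carry a high-frequency tone with random phase, then compares the HF energy of the MSE-optimal deterministic prediction (the ensemble mean) with the average HF energy of the true signals.

```python
import numpy as np

# Toy setup (all values are illustrative assumptions, not from the survey).
rng = np.random.default_rng(0)
sr = 16000          # sample rate in Hz
n = 1024            # samples per signal
t = np.arange(n) / sr

# Observed low band: identical for every plausible target.
low = np.sin(2 * np.pi * 500 * t)

# Plausible high bands: same 6 kHz tone, but with an unknown (random) phase.
phases = rng.uniform(0.0, 2.0 * np.pi, size=2000)
high = 0.5 * np.sin(2 * np.pi * 6000 * t[None, :] + phases[:, None])
targets = low[None, :] + high   # shape: (2000, n)

# For a single shared input, the deterministic output that minimizes
# mean squared error is the mean of the plausible targets.
mse_pred = targets.mean(axis=0)

def hf_energy(x, cutoff_hz=4000.0):
    """Hypothetical helper: spectral energy at or above cutoff_hz."""
    spec = np.fft.rfft(x, axis=-1)
    freqs = np.fft.rfftfreq(x.shape[-1], d=1.0 / sr)
    return np.sum(np.abs(spec[..., freqs >= cutoff_hz]) ** 2, axis=-1)

true_hf = float(np.mean(hf_energy(targets)))  # average HF energy of real targets
pred_hf = float(hf_energy(mse_pred))          # HF energy of the averaged prediction
hf_ratio = pred_hf / true_hf

print(f"HF energy ratio (MSE prediction / true average): {hf_ratio:.5f}")
```

Because the random phases cancel when averaged, the averaged prediction retains only a small fraction of the true HF energy, which is one way to picture spectral over-smoothing. A generative model would instead aim to sample one plausible high band rather than average over all of them.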
