FastWave: Optimized Diffusion Model for Audio Super-Resolution

Nikita Kuznetsov; Maksim Kaledin

arXiv:2603.04122 · cs.SD · March 5, 2026

TL;DR

FastWave is a computationally efficient diffusion-based model for audio super-resolution that outperforms NU-Wave 2 and is comparable to state-of-the-art models, at significantly lower training and inference cost.

Contribution

The paper introduces FastWave, a diffusion model for audio super-resolution that is faster and requires fewer resources than existing diffusion and flow models.

Findings

01

Outperforms NU-Wave 2 in quality

02

Comparable to state-of-the-art models

03

Requires only around 50 GFLOPs of compute and 1.3 million parameters
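
To put the parameter count in perspective, the following minimal sketch converts 1.3 million parameters into an approximate weight-storage size. The storage precisions (float32 and float16) and the helper name `weight_megabytes` are assumptions made for illustration; the source does not say how the weights are stored.

```python
# Rough weight-storage estimate for a 1.3 M parameter model.
# Assumption: weights are stored as float32 or float16; the source does not say.

PARAMS = 1_300_000  # "1.3 million parameters", as reported in the findings

BYTES_PER_VALUE = {"float32": 4, "float16": 2}  # assumed storage precisions


def weight_megabytes(n_params, dtype):
    """Return the size of n_params weights in megabytes (1 MB = 1e6 bytes)."""
    return n_params * BYTES_PER_VALUE[dtype] / 1e6


sizes = {dtype: weight_megabytes(PARAMS, dtype) for dtype in BYTES_PER_VALUE}
for dtype, megabytes in sizes.items():
    print(f"{dtype}: about {megabytes:.1f} MB of weights")
```

Under these assumptions the weights occupy only a few megabytes. The estimate covers parameters alone and says nothing about activation memory or the reported GFLOPs figure.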

Abstract

Audio super-resolution is a set of techniques for producing a high-quality estimate of a signal as if it had been sampled at a higher sample rate. Proposed methods include diffusion and flow models (generally considered slower) and generative adversarial networks (generally considered faster); however, both families currently rely on heavily parameterized networks that are computationally expensive to train and to run at inference. We address both problems by revisiting recent advances in diffusion model training and applying them to super-resolution from any input sample rate to 48 kHz. Our approach outperforms NU-Wave 2 and is comparable to state-of-the-art models. Our model, FastWave, has a computational complexity of around 50 GFLOPs and 1.3 M parameters, and can be trained with fewer resources and significantly faster than…
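
To make the task concrete, here is a minimal sketch of what an "any rate to 48 kHz" problem instance can look like. It assumes, as an illustration only, that a low-resolution input is simulated by removing all content above half the source sample rate while keeping the signal on the 48 kHz grid; the abstract does not describe FastWave's actual data pipeline. The helpers `dft`, `idft`, `simulate_low_res`, and `tone` are hypothetical, and the naive O(n²) transform is used only to keep the example dependency-free.

```python
import cmath
import math

TARGET_SR = 48_000  # output sample rate stated in the abstract


def dft(x):
    """Naive discrete Fourier transform (O(n^2), fine for short clips)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(spectrum):
    """Naive inverse DFT, returning the real part of each sample."""
    n = len(spectrum)
    return [sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n)
                for k in range(n)).real / n
            for t in range(n)]


def simulate_low_res(x, source_sr, target_sr=TARGET_SR):
    """Zero everything above source_sr / 2 while staying on the target grid.

    Assumption: an ideal low-pass stands in for a real recording made at
    source_sr; this is not taken from the paper.
    """
    n = len(x)
    cutoff_hz = source_sr / 2  # Nyquist frequency of the source rate
    spectrum = dft(x)
    for k in range(n):
        # Bin k and bin n - k hold the same physical frequency.
        freq_hz = min(k, n - k) * target_sr / n
        if freq_hz > cutoff_hz:
            spectrum[k] = 0
    return idft(spectrum)


def tone(freq_hz, n, sr=TARGET_SR):
    """Sine wave of n samples at the given frequency."""
    return [math.sin(2 * math.pi * freq_hz * t / sr) for t in range(n)]


n = 480  # 10 ms at 48 kHz, so DFT bins are exactly 100 Hz apart
low, high = tone(1_000, n), tone(10_000, n)
full_band = [a + b for a, b in zip(low, high)]  # stand-in for the 48 kHz target

# Simulated 16 kHz source: content above 8 kHz is removed.
degraded = simulate_low_res(full_band, source_sr=16_000)
max_err = max(abs(d - l) for d, l in zip(degraded, low))
print(f"max deviation from the 1 kHz tone alone: {max_err:.2e}")
```

In this toy setup the 10 kHz component disappears and only the 1 kHz tone survives. A super-resolution model would take something like `degraded` as input and try to recover `full_band`.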

Taxonomy

Topics: Speech and Audio Processing · Hearing Loss and Rehabilitation · Advanced Image Processing Techniques