FlashSR: One-step Versatile Audio Super-resolution via Diffusion Distillation
Jaekwon Im, Juhan Nam

TL;DR
FlashSR is a single-step diffusion model for versatile audio super-resolution. It restores high-frequency components in music, speech, and sound effects, producing 48kHz audio with far fewer sampling steps than previous diffusion-based methods.
Contribution
The paper introduces FlashSR, a one-step diffusion model trained with diffusion distillation, which enables fast inference for 48kHz audio super-resolution.
Findings
Achieves 22x faster inference than previous diffusion models.
Remains competitive with the current state-of-the-art model in objective and subjective evaluations.
Introduces the SR Vocoder, designed specifically for SR models that operate on mel-spectrograms.
Abstract
Versatile audio super-resolution (SR) is the challenging task of restoring high-frequency components from low-resolution audio with sampling rates between 4kHz and 32kHz in various domains such as music, speech, and sound effects. Previous diffusion-based SR methods suffer from slow inference due to the need for a large number of sampling steps. In this paper, we introduce FlashSR, a single-step diffusion model for versatile audio super-resolution aimed at producing 48kHz audio. FlashSR achieves fast inference by utilizing diffusion distillation with three objectives: distillation loss, adversarial loss, and distribution-matching distillation loss. We further enhance performance by proposing the SR Vocoder, which is specifically designed for SR models operating on mel-spectrograms. FlashSR demonstrates competitive performance with the current state-of-the-art model in both objective and…
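To make the task in the abstract concrete, here is a minimal sketch of which frequency band a super-resolution model has to restore for a given input rate. It relies only on the sampling theorem (audio sampled at rate fs carries content up to fs/2) and on the 48kHz target stated in the abstract. The function name `missing_band_hz` and the intermediate rates 8kHz and 16kHz are illustrative assumptions, not taken from the paper.

```python
TARGET_SR_HZ = 48_000  # output sampling rate stated in the abstract


def missing_band_hz(input_sr_hz, target_sr_hz=TARGET_SR_HZ):
    """Return (low, high) edges in Hz of the band absent from the input.

    Hypothetical helper for illustration: the input holds content up to
    its Nyquist frequency (input_sr_hz / 2), and the target can hold
    content up to target_sr_hz / 2.
    """
    if not 0 < input_sr_hz < target_sr_hz:
        raise ValueError("input rate must be positive and below the target rate")
    return input_sr_hz / 2, target_sr_hz / 2


# 4 kHz and 32 kHz are the range limits named in the abstract;
# 8 kHz and 16 kHz are assumed intermediate examples.
for sr in (4_000, 8_000, 16_000, 32_000):
    low, high = missing_band_hz(sr)
    print(f"{sr} Hz input: restore {low:.0f} Hz to {high:.0f} Hz")
```

At the low end of the range, a 4kHz input leaves everything above 2kHz to be generated, which is most of the audible spectrum. At 32kHz only the top 8kHz band is missing.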
Taxonomy
Topics: Image and Signal Denoising Methods · Advanced Image Processing Techniques · Digital Filter Design and Implementation
Methods: Diffusion
