AudioSR: Versatile Audio Super-resolution at Scale

Haohe Liu; Ke Chen; Qiao Tian; Wenwu Wang; Mark D. Plumbley

arXiv:2309.07314 · cs.SD · September 15, 2023 · 1 citation

TL;DR

AudioSR is a versatile diffusion-based model that significantly improves audio super-resolution across various audio types and bandwidths, outperforming previous methods and enhancing generative audio models.

Contribution

Introduces AudioSR, a diffusion-based model for robust audio super-resolution that upsamples input audio with a bandwidth between 2kHz and 16kHz to a 24kHz bandwidth (48kHz sampling rate), covering sound effects, music, and speech.

Findings

01

Achieves strong objective results on multiple benchmarks.

02

Enhances the output quality of generative audio models such as AudioLDM, FastSpeech2, and MusicGen.

03

Demonstrates versatility across sound effects, music, and speech.

Abstract

Audio super-resolution is a fundamental task that predicts high-frequency components for low-resolution audio, enhancing audio quality in digital applications. Previous methods have limitations such as the limited scope of audio types (e.g., music, speech) and specific bandwidth settings they can handle (e.g., 4kHz to 8kHz). In this paper, we introduce a diffusion-based generative model, AudioSR, that is capable of performing robust audio super-resolution on versatile audio types, including sound effects, music, and speech. Specifically, AudioSR can upsample any input audio signal within the bandwidth range of 2kHz to 16kHz to a high-resolution audio signal at 24kHz bandwidth with a sampling rate of 48kHz. Extensive objective evaluation on various audio super-resolution benchmarks demonstrates the strong result achieved by the proposed model. In addition, our subjective evaluation shows…
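The bandwidth figures in the abstract follow from the Nyquist relation: a 48kHz sampling rate can represent frequencies up to 24kHz, so an input limited to, say, 4kHz leaves the 4kHz to 24kHz band to be predicted. The sketch below works through that arithmetic only. It is not the AudioSR implementation; `SuperResolutionPlan` and `plan_super_resolution` are hypothetical names introduced for illustration, and rejecting inputs outside the 2kHz to 16kHz range is an assumption made here (the abstract states the supported range, not how the model treats other inputs).

```python
from dataclasses import dataclass
from typing import Tuple

# Figures taken from the abstract.
MIN_INPUT_BANDWIDTH_HZ = 2_000
MAX_INPUT_BANDWIDTH_HZ = 16_000
OUTPUT_SAMPLE_RATE_HZ = 48_000


def nyquist_bandwidth(sample_rate_hz: float) -> float:
    """Highest frequency representable at the given sampling rate."""
    return sample_rate_hz / 2.0


@dataclass(frozen=True)
class SuperResolutionPlan:
    """Hypothetical container describing one upsampling job."""

    input_bandwidth_hz: float
    output_bandwidth_hz: float

    @property
    def missing_band_hz(self) -> Tuple[float, float]:
        """Frequency band absent from the input that must be predicted."""
        return (self.input_bandwidth_hz, self.output_bandwidth_hz)

    @property
    def generated_fraction(self) -> float:
        """Share of the output bandwidth that is absent from the input."""
        missing = self.output_bandwidth_hz - self.input_bandwidth_hz
        return missing / self.output_bandwidth_hz


def plan_super_resolution(input_bandwidth_hz: float) -> SuperResolutionPlan:
    """Describe the band to predict for an input of the given bandwidth."""
    # Assumption for this sketch: inputs outside the stated range are rejected.
    if not MIN_INPUT_BANDWIDTH_HZ <= input_bandwidth_hz <= MAX_INPUT_BANDWIDTH_HZ:
        raise ValueError(
            f"input bandwidth {input_bandwidth_hz} Hz is outside the "
            f"{MIN_INPUT_BANDWIDTH_HZ}-{MAX_INPUT_BANDWIDTH_HZ} Hz range"
        )
    return SuperResolutionPlan(
        input_bandwidth_hz=input_bandwidth_hz,
        output_bandwidth_hz=nyquist_bandwidth(OUTPUT_SAMPLE_RATE_HZ),
    )


if __name__ == "__main__":
    plan = plan_super_resolution(4_000)
    print(plan.missing_band_hz)
    print(round(plan.generated_fraction, 3))
```

For a 4kHz input the missing band spans 4kHz to 24kHz, which is five sixths of the output bandwidth. Even at the upper limit of 16kHz, a third of the output spectrum is absent from the input.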

Code & Models

Repositories

haoheliu/versatile_audio_super_resolution
pytorch

Taxonomy

Topics: Speech and Audio Processing · Hearing Loss and Rehabilitation · Acoustic Wave Phenomena Research