Smule Renaissance Small: Efficient General-Purpose Vocal Restoration

Yongyi Zang; Chris Manchester; David Young; Ivan Ivanov; Jeffrey Lufkin; Martin Vladimirov; PJ Solomon; Svetoslav Kepchelev; Fei Yueh Chen; Dongting Cai; Teodor Naydenov; Randal Leistikow

arXiv:2510.21659 · cs.SD · October 27, 2025

Open Access · 1 model · 1 dataset

TL;DR

Smule Renaissance Small (SRS) is a compact single-stage model for end-to-end vocal restoration that handles multiple concurrent degradations (noise, reverberation, band-limiting, and clipping) in real time. It outperforms baselines and matches commercial systems without speech-specific training.

Contribution

The paper makes two contributions: SRS, a single-stage vocal restoration model that runs efficiently in real time, and the Extreme Degradation Bench (EDB), a set of recordings for evaluation under realistic multi-degradation conditions.

Findings

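01 On the DNS 5 Challenge blind set, SRS outperforms a strong GAN baseline despite having no speech training.

02 On the same blind set, SRS closely matches a computationally expensive flow-matching system.

03 On EDB, SRS surpasses all open-source baselines on singing.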

Abstract

Vocal recordings on consumer devices commonly suffer from multiple concurrent degradations: noise, reverberation, band-limiting, and clipping. We present Smule Renaissance Small (SRS), a compact single-stage model that performs end-to-end vocal restoration directly in the complex STFT domain. By incorporating phase-aware losses, SRS enables large analysis windows for improved frequency resolution while achieving 10.5x real-time inference on iPhone 12 CPU at 48 kHz. On the DNS 5 Challenge blind set, despite no speech training, SRS outperforms a strong GAN baseline and closely matches a computationally expensive flow-matching system. To enable evaluation under realistic multi-degradation scenarios, we introduce the Extreme Degradation Bench (EDB): 87 singing and speech recordings captured under severe acoustic conditions. On EDB, SRS surpasses all open-source baselines on singing and…
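The two efficiency claims in the abstract can be made concrete with a little arithmetic. The sketch below is illustrative only: the window lengths are hypothetical values chosen for the example (the abstract does not state the window size SRS uses), and "10.5x real-time" is read here as "10.5 seconds of audio processed per second of compute", which is the usual meaning but is an assumption about the paper's definition. Only the 48 kHz sample rate and the 10.5x figure come from the abstract.

```python
def bin_spacing_hz(sample_rate_hz: float, window_length: int) -> float:
    """Spacing between adjacent STFT frequency bins for a given window length."""
    return sample_rate_hz / window_length


def window_duration_ms(sample_rate_hz: float, window_length: int) -> float:
    """Duration of one analysis window in milliseconds."""
    return 1000.0 * window_length / sample_rate_hz


def processing_time_s(audio_duration_s: float, real_time_factor: float) -> float:
    """Compute time for a model that runs `real_time_factor` times faster than real time."""
    return audio_duration_s / real_time_factor


SAMPLE_RATE_HZ = 48_000   # from the abstract
REAL_TIME_FACTOR = 10.5   # from the abstract (iPhone 12 CPU)

# Hypothetical window lengths, for illustration only (not taken from the paper).
for window_length in (512, 2048, 4096):
    spacing = bin_spacing_hz(SAMPLE_RATE_HZ, window_length)
    duration = window_duration_ms(SAMPLE_RATE_HZ, window_length)
    print(f"window={window_length:5d}  bin spacing={spacing:8.3f} Hz  duration={duration:6.2f} ms")

# Time to restore a one-minute recording under the assumed reading of "10.5x real-time".
print(f"60 s of audio -> {processing_time_s(60.0, REAL_TIME_FACTOR):.2f} s of compute")
```

Longer windows give finer bin spacing at the cost of coarser time resolution, which is the trade-off the abstract refers to when it says phase-aware losses enable large analysis windows.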

Code & Models

Models

🤗 smulelabs/Smule-Renaissance-Small
model · ♡ 7

Datasets

smulelabs/ExtremeDegradationBench
dataset · 71 dl

Taxonomy

Topics: Speech and Audio Processing · Speech Recognition and Synthesis · Voice and Speech Disorders