VibE-SVC: Vibrato Extraction with High-frequency F0 Contour for Singing Voice Conversion

Joon-Seung Choi; Dong-Min Byun; Hyung-Seok Oh; Seong-Whan Lee

arXiv:2505.20794·cs.SD·October 7, 2025

TL;DR

VibE-SVC extracts vibrato explicitly by applying a discrete wavelet transform to the F0 contour, which makes vibrato controllable in singing voice conversion and supports more expressive, natural-sounding singing synthesis.

Contribution

The paper presents a new vibrato modeling approach that explicitly extracts vibrato features, enabling precise control and transfer in singing voice conversion.

Findings

01 Effective vibrato transfer demonstrated in experiments

02 High-quality singing voice conversion with preserved speaker identity

03 Enhanced expressiveness through vibrato manipulation

Abstract

Controlling singing style is crucial for achieving an expressive and natural singing voice. Among the various style factors, vibrato plays a key role in conveying emotions and enhancing musical depth. However, modeling vibrato remains challenging due to its dynamic nature, making it difficult to control in singing voice conversion. To address this, we propose VibE-SVC, a controllable singing voice conversion model that explicitly extracts and manipulates vibrato using the discrete wavelet transform. Unlike previous methods that model vibrato implicitly, our approach decomposes the F0 contour into frequency components, enabling precise vibrato transfer and control and giving the user more flexibility. Experimental results show that VibE-SVC effectively transforms singing styles while preserving speaker similarity. Both subjective and objective evaluations confirm high-quality conversion.
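The decomposition described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not state the wavelet family, the number of decomposition levels, the frame rate, or whether F0 is handled in Hz or in the log domain, so the Haar wavelet, `levels=4`, the 100 frames/s rate, and the helper names `split_f0`, `scale_vibrato`, and `transfer_vibrato` below are all assumptions chosen for illustration. The sketch also assumes a continuous F0 contour with unvoiced frames already interpolated.

```python
import numpy as np

def haar_dwt(x):
    """One level of the orthonormal Haar DWT (len(x) must be even)."""
    even, odd = x[0::2], x[1::2]
    return (even + odd) / np.sqrt(2.0), (even - odd) / np.sqrt(2.0)

def haar_idwt(approx, detail):
    """Inverse of haar_dwt."""
    x = np.empty(2 * len(approx))
    x[0::2] = (approx + detail) / np.sqrt(2.0)
    x[1::2] = (approx - detail) / np.sqrt(2.0)
    return x

def split_f0(f0, levels=4):
    """Split an F0 contour into a low-frequency trend and a
    high-frequency residual (assumed here to carry the vibrato)."""
    f0 = np.asarray(f0, dtype=float)
    n = len(f0)
    pad = (-n) % (2 ** levels)            # pad to a multiple of 2**levels
    approx = np.pad(f0, (0, pad), mode="edge")
    for _ in range(levels):               # keep only the coarsest approximation
        approx, _ = haar_dwt(approx)
    low = approx
    for _ in range(levels):               # reconstruct with all details zeroed
        low = haar_idwt(low, np.zeros_like(low))
    low = low[:n]
    return low, f0 - low

def scale_vibrato(f0, scale, levels=4):
    """Scale the high-frequency part: 0 removes it, 1 leaves F0 unchanged."""
    low, high = split_f0(f0, levels)
    return low + scale * high

def transfer_vibrato(f0_source, f0_reference, levels=4):
    """Combine the source trend with the reference's high-frequency part
    (both contours are assumed to have the same length)."""
    low_src, _ = split_f0(f0_source, levels)
    _, high_ref = split_f0(f0_reference, levels)
    return low_src + high_ref

# Toy contour (hypothetical values): 220 Hz note, 3 Hz-deep vibrato at 6.25 Hz,
# sampled at an assumed 100 frames per second.
t = np.arange(160) / 100.0
f0 = 220.0 + 3.0 * np.sin(2.0 * np.pi * 6.25 * t)
flat = scale_vibrato(f0, 0.0)    # vibrato removed
deep = scale_vibrato(f0, 2.0)    # vibrato depth doubled
```

In this toy case one vibrato period spans exactly 16 frames, so the four-level Haar trend averages the oscillation away. Real contours would not align this neatly, and a smoother wavelet would likely separate the components more cleanly.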
