Singing voice correction using canonical time warping

Yin-Jyun Luo; Ming-Tso Chen; Tai-Shih Chi; Li Su

arXiv:1711.08600·eess.AS·November 27, 2017·ICASSP

Open Access

TL;DR

This paper applies canonical time warping (CTW) to singing voice correction by aligning amateur recordings to professional ones, improving pitch accuracy and naturalness and outperforming DTW and commercial auto-tuning software in subjective tests.

Contribution

The paper presents a novel application of canonical time warping (CTW) to singing voice correction, demonstrating robustness to pitch-shifting and time-stretching and better subjective results than DTW and commercial auto-tuning software.

Findings

01

CTW is robust against pitch-shifting and time-stretching effects.

02

Subjective tests show CTW outperforms DTW and auto-tuning software.

03

The method is applicable in real-world singing correction scenarios.

Abstract

Expressive singing voice correction is an appealing but challenging problem. A robust time-warping algorithm that synchronizes two singing recordings can provide a promising solution. We therefore propose to address the problem with canonical time warping (CTW), which aligns amateur singing recordings to professional ones. A new pitch contour is generated from the alignment information, and a pitch-corrected singing voice is synthesized through a vocoder. The objective evaluation shows that CTW is robust against pitch-shifting and time-stretching effects, and the subjective test demonstrates that CTW prevails over the other methods, including DTW and commercial auto-tuning software. Finally, we demonstrate the applicability of the proposed method in a practical, real-world scenario.
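
The align-then-transfer idea in the abstract can be illustrated with a minimal sketch. It uses plain dynamic time warping (DTW), the baseline named in the abstract, rather than CTW itself, and it is not the authors' implementation. The function names `dtw_path` and `transfer_pitch`, the use of raw pitch values (in semitones) as the alignment feature, and the toy data are all assumptions made for illustration; the vocoder synthesis step is omitted.

```python
import math


def dtw_path(x, y):
    """Return an optimal DTW alignment path between 1-D sequences x and y.

    The path is a list of (i, j) index pairs, from (0, 0) to (len(x)-1, len(y)-1).
    """
    n, m = len(x), len(y)
    # cost[i][j] = minimal accumulated cost of aligning x[:i] with y[:j]
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(x[i - 1] - y[j - 1])  # local distance (assumed: absolute difference)
            cost[i][j] = d + min(cost[i - 1][j - 1], cost[i - 1][j], cost[i][j - 1])

    # Backtrack from the end, always moving to the cheapest predecessor.
    i, j = n, m
    path = []
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        _, i, j = min(
            (cost[i - 1][j - 1], i - 1, j - 1),
            (cost[i - 1][j], i - 1, j),
            (cost[i][j - 1], i, j - 1),
        )
    path.reverse()
    return path


def transfer_pitch(amateur, reference):
    """Build a corrected contour: one reference pitch per amateur frame.

    Each amateur frame takes the mean of the reference frames aligned to it,
    so the result keeps the amateur's timing but follows the reference pitch.
    """
    aligned = [[] for _ in amateur]
    for i, j in dtw_path(amateur, reference):
        aligned[i].append(reference[j])
    return [sum(values) / len(values) for values in aligned]


# Toy data (hypothetical): a slightly sharp, slower amateur take of three notes.
amateur = [60.2, 60.3, 62.4, 62.5, 64.1]
reference = [60.0, 62.0, 64.0]
print(transfer_pitch(amateur, reference))  # -> [60.0, 60.0, 62.0, 62.0, 64.0]
```

In this sketch the corrected contour has one value per amateur frame, which is the form a vocoder would need to resynthesize the amateur recording with the new pitch. CTW differs from this baseline by additionally learning projections of the two feature sequences before aligning them, which the sketch does not attempt.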

Taxonomy

Topics: Music and Audio Processing · Time Series Analysis and Forecasting · Speech Recognition and Synthesis