MLP Singer: Towards Rapid Parallel Korean Singing Voice Synthesis

Jaesung Tae; Hyeongju Kim; Younggun Lee

arXiv:2106.07886·cs.SD·November 23, 2021

TL;DR

MLP Singer is an MLP-based parallel Korean singing voice synthesis system. In listening tests it outperforms a larger autoregressive GAN-based system in audio quality, and it synthesizes faster, reaching real-time factors of up to 200 on CPUs and 3400 on GPUs.

Contribution

To the best of the authors' knowledge, this is the first work to use an entirely MLP-based architecture for voice synthesis. The resulting system generates singing voice audio in parallel rather than autoregressively.

Findings

01

Outperforms a larger autoregressive GAN-based system in listening tests, in both audio quality and synthesis speed

02

Reaches real-time factors of up to 200 on CPUs and 3400 on GPUs, an order of magnitude faster generation in both environments

03

Demonstrates that an entirely MLP-based architecture is effective for singing voice synthesis
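To make the 200 and 3400 figures reported for CPU and GPU concrete, the sketch below converts a real-time factor into wall-clock synthesis time. It assumes the real-time factor is defined as seconds of audio produced per second of computation, which is consistent with larger values meaning faster synthesis. The function names and the 180-second song are hypothetical and chosen only for illustration.

```python
def real_time_factor(audio_seconds: float, synthesis_seconds: float) -> float:
    """Seconds of audio generated per second of compute (assumed definition)."""
    if synthesis_seconds <= 0:
        raise ValueError("synthesis_seconds must be positive")
    return audio_seconds / synthesis_seconds


def synthesis_time(audio_seconds: float, rtf: float) -> float:
    """Wall-clock seconds needed to generate the audio at a given real-time factor."""
    if rtf <= 0:
        raise ValueError("rtf must be positive")
    return audio_seconds / rtf


# Hypothetical 3-minute song at the upper-bound factors quoted above.
song_seconds = 180.0
cpu_seconds = synthesis_time(song_seconds, 200.0)   # 0.9 s
gpu_seconds = synthesis_time(song_seconds, 3400.0)  # roughly 0.053 s
print(f"CPU: {cpu_seconds:.3f} s, GPU: {gpu_seconds:.3f} s")
```

Because both figures are stated as upper bounds ("up to"), these times should be read as best-case estimates rather than typical ones.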

Abstract

Recent developments in deep learning have significantly improved the quality of synthesized singing voice audio. However, prominent neural singing voice synthesis systems suffer from slow inference speed due to their autoregressive design. Inspired by MLP-Mixer, a novel architecture introduced in the vision literature for attention-free image classification, we propose MLP Singer, a parallel Korean singing voice synthesis system. To the best of our knowledge, this is the first work that uses an entirely MLP-based architecture for voice synthesis. Listening tests demonstrate that MLP Singer outperforms a larger autoregressive GAN-based system, both in terms of audio quality and synthesis speed. In particular, MLP Singer achieves real-time factors of up to 200 and 3400 on CPUs and GPUs respectively, enabling an order of magnitude faster generation in both environments.
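The abstract credits MLP-Mixer as the inspiration, so a minimal sketch of one generic Mixer block may help show why such a design is parallel: every frame is computed in a single pass, with no dependence on previously generated output. This is not the MLP Singer architecture itself. The layer sizes, the helper names, and the reading of the input as a (frames, channels) matrix are assumptions, and the code uses plain Python lists only so that it runs without dependencies (the official repository is listed as PyTorch).

```python
import math
import random


def linear(x, w, b):
    """Compute x @ w + b for x of shape (rows, n_in) and w of shape (n_in, n_out)."""
    cols = list(zip(*w))  # columns of w, each of length n_in
    return [[sum(a * c for a, c in zip(row, col)) + bias
             for col, bias in zip(cols, b)] for row in x]


def gelu(v):
    """Exact GELU activation."""
    return 0.5 * v * (1.0 + math.erf(v / math.sqrt(2.0)))


def mlp(x, params):
    """Two-layer perceptron applied independently to each row of x."""
    w1, b1, w2, b2 = params
    hidden = [[gelu(v) for v in row] for row in linear(x, w1, b1)]
    return linear(hidden, w2, b2)


def transpose(x):
    return [list(col) for col in zip(*x)]


def layer_norm(x, eps=1e-5):
    """Normalize each row to zero mean and unit variance (no learned scale or shift)."""
    out = []
    for row in x:
        mean = sum(row) / len(row)
        var = sum((v - mean) ** 2 for v in row) / len(row)
        out.append([(v - mean) / math.sqrt(var + eps) for v in row])
    return out


def add(a, b):
    return [[u + v for u, v in zip(ra, rb)] for ra, rb in zip(a, b)]


def mixer_block(x, token_params, channel_params):
    """One generic Mixer block on x of shape (frames, channels)."""
    # Token mixing: transpose so the MLP mixes information across frames.
    mixed = transpose(mlp(transpose(layer_norm(x)), token_params))
    x = add(x, mixed)  # skip connection
    # Channel mixing: the MLP mixes features within each frame.
    return add(x, mlp(layer_norm(x), channel_params))  # skip connection


def init_mlp(dim, hidden, rng):
    """Small random weights and zero biases for a dim -> hidden -> dim MLP."""
    w1 = [[rng.gauss(0.0, 0.02) for _ in range(hidden)] for _ in range(dim)]
    w2 = [[rng.gauss(0.0, 0.02) for _ in range(dim)] for _ in range(hidden)]
    return w1, [0.0] * hidden, w2, [0.0] * dim


rng = random.Random(0)
frames, channels = 8, 4  # illustrative sizes only
x = [[rng.gauss(0.0, 1.0) for _ in range(channels)] for _ in range(frames)]
out = mixer_block(x, init_mlp(frames, 16, rng), init_mlp(channels, 16, rng))
print(len(out), len(out[0]))  # same (frames, channels) shape as the input
```

Note that the token-mixing weights have a fixed input size equal to the number of frames, so a model built this way would have to process inputs in fixed-length chunks.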

Code & Models

Repositories

neosapience/mlp-singer
PyTorch · Official
