PaddleSpeech: An Easy-to-Use All-in-One Speech Toolkit

Hui Zhang; Tian Yuan; Junkun Chen; Xintong Li; Renjie Zheng; Yuxin Huang; Xiaojie Chen; Enlei Gong; Zeyu Chen; Xiaoguang Hu; Dianhai Yu; Yanjun Ma; Liang Huang

arXiv:2205.12007·eess.AS·May 25, 2022

TL;DR

PaddleSpeech is an open-source speech toolkit that covers several speech-to-text and text-to-speech tasks with competitive or state-of-the-art performance, and is designed to make research and development in speech processing easier.

Contribution

The paper introduces an all-in-one speech toolkit with a command-line interface, a simple code structure, and recipes and pretrained models for reproducing its experimental results quickly.

Findings

01 Achieves competitive or state-of-the-art results on various speech datasets

02 Provides a single toolkit for several speech-to-text and text-to-speech tasks

03 Provides recipes and pretrained models for quickly reproducing the paper's experimental results

Abstract

PaddleSpeech is an open-source all-in-one speech toolkit. It aims to facilitate the development and research of speech processing technologies by providing an easy-to-use command-line interface and a simple code structure. This paper describes the design philosophy and core architecture of PaddleSpeech to support several essential speech-to-text and text-to-speech tasks. PaddleSpeech achieves competitive or state-of-the-art performance on various speech datasets and implements the most popular methods. It also provides recipes and pretrained models to quickly reproduce the experimental results in this paper. PaddleSpeech is publicly available at https://github.com/PaddlePaddle/PaddleSpeech.

Taxonomy

Topics: Speech Recognition and Synthesis · Speech and dialogue systems · Computational Physics and Python Applications