PaddleSpeech: An Easy-to-Use All-in-One Speech Toolkit
Hui Zhang, Tian Yuan, Junkun Chen, Xintong Li, Renjie Zheng, Yuxin Huang, Xiaojie Chen, Enlei Gong, Zeyu Chen, Xiaoguang Hu, Dianhai Yu, Yanjun Ma, Liang Huang

TL;DR
PaddleSpeech is an open-source, user-friendly speech toolkit that supports multiple speech-to-text and text-to-speech tasks with competitive or state-of-the-art performance, designed to facilitate research and development in speech technologies.
Contribution
It introduces a comprehensive, easy-to-use speech toolkit with a unified architecture, recipes, and pretrained models for rapid experimentation and deployment.
Findings
Achieves competitive or state-of-the-art results on various speech datasets
Provides a unified toolkit for multiple speech tasks
Enables quick reproduction of experimental results
Abstract
PaddleSpeech is an open-source all-in-one speech toolkit. It aims at facilitating the development and research of speech processing technologies by providing an easy-to-use command-line interface and a simple code structure. This paper describes the design philosophy and core architecture of PaddleSpeech to support several essential speech-to-text and text-to-speech tasks. PaddleSpeech achieves competitive or state-of-the-art performance on various speech datasets and implements the most popular methods. It also provides recipes and pretrained models to quickly reproduce the experimental results in this paper. PaddleSpeech is publicly available at https://github.com/PaddlePaddle/PaddleSpeech.
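The command-line interface mentioned in the abstract can be illustrated with a minimal sketch. It assumes the `paddlespeech` executable with `asr` and `tts` subcommands and the `--lang`, `--input`, and `--output` flags as shown in the project README; these may differ between releases. The helper names `build_cli_command` and `run_if_installed` and the file names `zh.wav` and `output.wav` are hypothetical placeholders, not part of the toolkit.

```python
import shutil
import subprocess


def build_cli_command(task, **options):
    """Build an argument list such as ['paddlespeech', 'asr', '--input', 'zh.wav'].

    Hypothetical helper: it only assembles the command, it does not run it.
    """
    command = ["paddlespeech", task]
    for name, value in options.items():
        command += [f"--{name}", str(value)]
    return command


def run_if_installed(command):
    """Run the command only when its executable is found on PATH."""
    if shutil.which(command[0]) is None:
        print("executable not found; would run:", " ".join(command))
        return None
    return subprocess.run(command, check=True)


# Speech-to-text: transcribe a local recording (placeholder file name).
asr_command = build_cli_command("asr", lang="zh", input="zh.wav")

# Text-to-speech: synthesize a sentence into a WAV file (placeholder file name).
tts_command = build_cli_command(
    "tts", input="你好，欢迎使用百度飞桨深度学习框架！", output="output.wav"
)

if __name__ == "__main__":
    # Assumes the toolkit is installed; otherwise the commands are only printed.
    run_if_installed(asr_command)
    run_if_installed(tts_command)
```

Building the argument list separately from running it keeps the sketch usable on machines where the toolkit is not installed, and passing a list to `subprocess.run` avoids shell quoting issues with the non-ASCII input text.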
Taxonomy
Topics: Speech Recognition and Synthesis · Speech and dialogue systems · Computational Physics and Python Applications
