NeurST: Neural Speech Translation Toolkit
Chengqi Zhao, Mingxuan Wang, Qianqian Dong, Rong Ye, Lei Li

TL;DR
NeurST is an open-source toolkit for end-to-end neural speech translation research. It provides step-by-step recipes and baseline results on several benchmark datasets.
Contribution
The paper presents the framework design of NeurST, with recipes for feature extraction, data preprocessing, distributed training, and evaluation, and reports results on benchmark datasets intended as reliable baselines for future research.
Findings
Provides step-by-step recipes for speech translation tasks
Establishes reliable benchmarks for various datasets
Reports experimental results on different benchmark datasets as baselines for future research
Abstract
NeurST is an open-source toolkit for neural speech translation. The toolkit mainly focuses on end-to-end speech translation, which is easy to use, modify, and extend to advanced speech translation research and products. NeurST aims at facilitating the speech translation research for NLP researchers and building reliable benchmarks for this field. It provides step-by-step recipes for feature extraction, data preprocessing, distributed training, and evaluation. In this paper, we will introduce the framework design of NeurST and show experimental results for different benchmark datasets, which can be regarded as reliable baselines for future research. The toolkit is publicly available at https://github.com/bytedance/neurst/ and we will continuously update the performance of NeurST with other counterparts and studies at https://st-benchmark.github.io/.
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Speech Recognition and Synthesis
Methods: Linear Layer · Absolute Position Encodings · Position-Wise Feed-Forward Layer · Residual Connection · Label Smoothing · Dropout · Byte Pair Encoding · Adam · Dense Connections · Softmax
