VocBench: A Neural Vocoder Benchmark for Speech Synthesis
Ehab A. AlBadawy, Andrew Gibiansky, Qing He, Jilong Wu, Ming-Ching Chang, Siwei Lyu

TL;DR
VocBench provides a standardized benchmarking framework for neural vocoders in speech synthesis, enabling fair comparison of their performance through systematic evaluation using consistent datasets, training, and metrics.
Contribution
It introduces a comprehensive benchmarking framework that facilitates fair and systematic comparison of state-of-the-art neural vocoders in speech synthesis.
Findings
The framework compares vocoders under the same datasets, training pipeline, and evaluation metrics.
The evaluation shows differences in quality and efficacy among the vocoders.
The framework supports both subjective and objective evaluation methods.
Abstract
Neural vocoders, used for converting the spectral representations of an audio signal to waveforms, are a commonly used component in speech synthesis pipelines. They focus on synthesizing waveforms from low-dimensional representations, such as Mel-spectrograms. In recent years, different approaches have been introduced to develop such vocoders. However, it has become more challenging to assess these new vocoders and compare their performance to previous ones. To address this problem, we present VocBench, a framework that benchmarks the performance of state-of-the-art neural vocoders. VocBench uses a systematic study to evaluate different neural vocoders in a shared environment that enables a fair comparison between them. In our experiments, we use the same setup for datasets, training pipeline, and evaluation metrics for all neural vocoders. We perform a subjective and objective…
Taxonomy
Topics: Speech Recognition and Synthesis · Speech and Audio Processing · Music and Audio Processing
Methods: 1x1 Convolution · Grouped Convolution · Dilated Convolution · Residual Connection · Sigmoid Activation · Softmax · WaveRNN · FiLM Module · WaveGrad DBlock
