Generating Mandarin and Cantonese F0 Contours with Decision Trees and BLSTMs
Weidong Yuan, Alan W Black

TL;DR
This paper introduces new decision tree and deep neural network models, including the Additive-BLSTM, for generating accurate F0 contours in Mandarin and Cantonese speech, with both objective and subjective evaluations showing improved performance.
Contribution
The paper proposes the Additive-BLSTM model and demonstrates its superiority over traditional decision tree methods for F0 contour prediction in tonal languages.
Findings
Additive-BLSTM outperforms decision tree models on the objective measures (RMSE and correlation).
Within the decision tree framework, tone-dependent trees combined with sample normalization and delta feature regularization perform best.
Subjective tests favor the Additive-BLSTM model.
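The objective measures named in the findings, RMSE and correlation, can be computed as in the minimal sketch below. It assumes the predicted and reference F0 contours are already time-aligned, equal-length sequences; the function names and the sample values are hypothetical and are not results from the paper.

```python
import math


def rmse(pred, ref):
    """Root mean squared error between two equal-length F0 sequences."""
    if len(pred) != len(ref) or not pred:
        raise ValueError("sequences must be non-empty and of equal length")
    return math.sqrt(sum((p - r) ** 2 for p, r in zip(pred, ref)) / len(pred))


def pearson_correlation(pred, ref):
    """Pearson correlation coefficient between two equal-length F0 sequences."""
    if len(pred) != len(ref) or not pred:
        raise ValueError("sequences must be non-empty and of equal length")
    n = len(pred)
    mean_p = sum(pred) / n
    mean_r = sum(ref) / n
    cov = sum((p - mean_p) * (r - mean_r) for p, r in zip(pred, ref))
    var_p = sum((p - mean_p) ** 2 for p in pred)
    var_r = sum((r - mean_r) ** 2 for r in ref)
    return cov / math.sqrt(var_p * var_r)


# Made-up F0 values in Hz, for illustration only.
reference_f0 = [200.0, 210.0, 220.0, 230.0]
predicted_f0 = [202.0, 208.0, 222.0, 228.0]

error = rmse(predicted_f0, reference_f0)
corr = pearson_correlation(predicted_f0, reference_f0)
```

Lower RMSE and higher correlation both indicate a predicted contour that is closer to the reference.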
Abstract
This paper models fundamental frequency (F0) contours for both Mandarin and Cantonese speech with decision trees and DNNs (deep neural networks). Different F0 representations and model architectures are tested for both decision trees and DNNs. A new model called Additive-BLSTM (additive bidirectional long short-term memory), which predicts a base F0 contour and a residual F0 contour with two BLSTMs, is proposed. With respect to the objective measures of RMSE and correlation, applying tone-dependent trees together with sample normalization and delta feature regularization performs best within the decision tree framework, while the new Additive-BLSTM model with delta feature regularization performs even better. Subjective listening tests on both Mandarin and Cantonese comparing a Random Forest model (multiple decision trees) with the Additive-BLSTM model were also conducted and confirmed the advantage of…
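The additive structure described in the abstract, two BLSTMs whose outputs are summed, might be sketched in PyTorch as follows. This is an illustration only: the abstract does not specify the input features, layer sizes, how the base and residual targets are defined, or the training loss, so the class names, the shared input to both networks, and all dimensions here are assumptions rather than the authors' implementation.

```python
import torch
from torch import nn


class BLSTMRegressor(nn.Module):
    """One bidirectional LSTM followed by a linear layer: one F0 value per time step."""

    def __init__(self, in_dim: int, hidden: int):
        super().__init__()
        self.blstm = nn.LSTM(in_dim, hidden, batch_first=True, bidirectional=True)
        # A bidirectional LSTM concatenates forward and backward states: 2 * hidden.
        self.proj = nn.Linear(2 * hidden, 1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        h, _ = self.blstm(x)              # (batch, time, 2 * hidden)
        return self.proj(h).squeeze(-1)   # (batch, time)


class AdditiveBLSTM(nn.Module):
    """Assumed layout: both BLSTMs read the same features; their outputs are added."""

    def __init__(self, in_dim: int, hidden: int = 32):
        super().__init__()
        self.base_net = BLSTMRegressor(in_dim, hidden)
        self.residual_net = BLSTMRegressor(in_dim, hidden)

    def forward(self, x: torch.Tensor):
        base = self.base_net(x)           # base F0 contour
        residual = self.residual_net(x)   # residual F0 contour
        return base + residual, base, residual


# Hypothetical shapes: 2 utterances, 5 time steps, 8 input features per step.
model = AdditiveBLSTM(in_dim=8)
features = torch.randn(2, 5, 8)
f0, base, residual = model(features)
```

Returning the base and residual contours alongside their sum makes it possible to supervise or inspect each component separately, should a training setup call for that.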
Taxonomy
Topics: Speech Recognition and Synthesis · Speech and Audio Processing · Phonetics and Phonology Research
