Investigation of Japanese PnG BERT language model in text-to-speech synthesis for pitch accent language
Yusuke Yasuda, Tomoki Toda

TL;DR
This paper explores how PnG BERT, a self-supervised pretrained model, can improve pitch accent accuracy in Japanese end-to-end TTS through different fine-tuning strategies and an additional tone prediction task.
Contribution
It demonstrates that PnG BERT features improve pitch accent rendering in Japanese TTS, and it examines how the number of fine-tuned layers and an auxiliary tone prediction task change what those features capture.
Findings
PnG BERT features help the TTS model infer pitch accents.
The number of fine-tuned layers affects feature content and performance.
PnG BERT outperforms the baseline Tacotron in accent correctness.
Abstract
End-to-end text-to-speech synthesis (TTS) can generate highly natural synthetic speech from raw text. However, rendering the correct pitch accents is still a challenging problem for end-to-end TTS. To tackle the challenge of rendering correct pitch accents in Japanese end-to-end TTS, we adopt PnG BERT, a self-supervised pretrained model in the character and phoneme domain for TTS. We investigate the effects of features captured by PnG BERT on Japanese TTS by modifying the fine-tuning condition to determine the conditions helpful for inferring pitch accents. We manipulate the content of PnG BERT features from being text-oriented to speech-oriented by changing the number of fine-tuned layers during TTS. In addition, we teach PnG BERT pitch accent information by fine-tuning with tone prediction as an additional downstream task. Our experimental results show that the features of PnG BERT captured by…
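The two fine-tuning controls described in the abstract, the number of fine-tuned layers and the auxiliary tone prediction task, can be illustrated with a minimal Python sketch. It is not the authors' implementation. The names `finetune_mask`, `joint_loss`, and `tone_weight` are hypothetical, and two details are assumptions the abstract does not confirm: that the fine-tuned layers are the top-most ones, and that the tone prediction loss is added to the TTS loss as a weighted sum.

```python
from typing import List


def finetune_mask(num_layers: int, num_finetuned: int) -> List[bool]:
    """Return one flag per encoder layer: True if the layer is updated during TTS training.

    Assumption (not stated in the abstract): the fine-tuned layers are the
    top-most ones, and the layers below them stay frozen.
    """
    if not 0 <= num_finetuned <= num_layers:
        raise ValueError("num_finetuned must be between 0 and num_layers")
    num_frozen = num_layers - num_finetuned
    # Layers are indexed from the bottom (0) to the top (num_layers - 1).
    return [index >= num_frozen for index in range(num_layers)]


def joint_loss(tts_loss: float, tone_loss: float, tone_weight: float = 1.0) -> float:
    """Combine the TTS loss with an auxiliary tone prediction loss.

    The weighted sum and the default weight are illustrative assumptions.
    """
    return tts_loss + tone_weight * tone_loss


if __name__ == "__main__":
    # Hypothetical 6-layer encoder with only the top 2 layers fine-tuned.
    print(finetune_mask(6, 2))
    print(joint_loss(1.0, 0.5, tone_weight=0.2))
```

Under these assumptions, a small `num_finetuned` keeps most layers at their pretrained, text-oriented state, while a larger value lets more layers adapt toward the speech-oriented TTS objective.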
Taxonomy
Methods: Sigmoid Activation · Highway Layer · Bidirectional GRU · Convolution · Residual Connection · Max Pooling · Tanh Activation · Highway Network
