A Polyphone BERT for Polyphone Disambiguation in Mandarin Chinese
Song Zhang, Ken Zheng, Xiaoxu Zhu, Baoxiang Li

TL;DR
This paper introduces a specialized BERT model for Mandarin Chinese polyphone disambiguation, significantly improving pronunciation prediction accuracy in G2P conversion for TTS systems.
Contribution
It extends pre-trained Chinese BERT with new tokens for monophonic characters, turning disambiguation into a pre-training task, and achieves state-of-the-art accuracy.
Findings
Achieved 94.1% accuracy, a 2% improvement over previous models.
Created 741 new monophonic characters from 354 source polyphonic characters for model training.
Demonstrated effectiveness of the polyphone BERT in disambiguation tasks.
Abstract
Grapheme-to-phoneme (G2P) conversion is an indispensable part of the Chinese Mandarin text-to-speech (TTS) system, and the core of G2P conversion is polyphone disambiguation, which is to pick the correct pronunciation from several candidates for a Chinese polyphonic character. In this paper, we propose a Chinese polyphone BERT model to predict the pronunciations of Chinese polyphonic characters. First, we create 741 new Chinese monophonic characters from 354 source Chinese polyphonic characters according to pronunciation. Then we obtain a Chinese polyphone BERT by extending a pre-trained Chinese BERT with the 741 new Chinese monophonic characters and adding a corresponding embedding layer for the new tokens, which is initialized from the embeddings of the source Chinese polyphonic characters. In this way, we can turn the polyphone disambiguation task into a pre-training task of the…
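The vocabulary-extension step described in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the function name `extend_vocab_and_embeddings`, the token naming scheme (`乐_le4`, `乐_yue4`), and the toy 4-dimensional embeddings are all assumptions made for illustration. It only shows the idea of adding one new monophonic token per pronunciation and initializing its embedding from the source polyphonic character.

```python
import random


def extend_vocab_and_embeddings(vocab, embeddings, new_to_source):
    """Return copies of vocab/embeddings with new tokens appended.

    Each new token's embedding row is a copy of its source token's row.
    (Hypothetical helper; the paper's actual implementation is not shown.)
    """
    vocab = dict(vocab)
    embeddings = [row[:] for row in embeddings]
    for new_token, source_token in new_to_source.items():
        if new_token in vocab:
            raise ValueError(f"token already in vocabulary: {new_token}")
        source_row = embeddings[vocab[source_token]]
        vocab[new_token] = len(embeddings)   # next free index
        embeddings.append(source_row[:])     # initialize from the source character
    return vocab, embeddings


# Toy setup: a tiny vocabulary with random 4-dimensional embeddings.
random.seed(0)
vocab = {"[PAD]": 0, "[MASK]": 1, "乐": 2, "音": 3}
embeddings = [[random.gauss(0.0, 0.02) for _ in range(4)] for _ in vocab]

# The polyphonic character 乐 (le4 / yue4) becomes two monophonic tokens.
# The "char_pinyin" naming is an assumption, not the paper's notation.
new_to_source = {"乐_le4": "乐", "乐_yue4": "乐"}

new_vocab, new_embeddings = extend_vocab_and_embeddings(
    vocab, embeddings, new_to_source
)
```

In a Hugging Face `transformers` setup, the analogous steps would likely be `tokenizer.add_tokens(...)` followed by `model.resize_token_embeddings(len(tokenizer))` and then copying the source rows into the new ones; the excerpt does not say which tooling the authors used.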
Taxonomy
Topics: Natural Language Processing Techniques · Speech Recognition and Synthesis · Topic Modeling
Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Weight Decay · Layer Normalization · Linear Warmup With Linear Decay · Residual Connection · WordPiece · Dropout
