A Character-level Span-based Model for Mandarin Prosodic Structure Prediction
Xueyuan Chen, Changhe Song, Yixuan Zhou, Zhiyong Wu, Changbin Chen, Zhongqin Wu, Helen Meng

TL;DR
This paper introduces a span-based, end-to-end model for Mandarin prosodic structure prediction that leverages character-level BERT and a CKY-style algorithm to improve accuracy without relying on word segmentation.
Contribution
The proposed model predicts prosodic structures directly from Chinese characters using span representations and a CKY-style algorithm, avoiding the errors introduced by word segmentation.
Findings
Outperforms sequence-to-sequence baselines on real-world datasets
Predicts multiple prosodic levels simultaneously
Operates directly on Chinese characters in an end-to-end manner
Abstract
The accuracy of prosodic structure prediction is crucial to the naturalness of synthesized speech in Mandarin text-to-speech systems, but it is currently limited by the widely used sequence-to-sequence framework and by error accumulation from upstream word segmentation. In this paper, we propose a span-based Mandarin prosodic structure prediction model that obtains an optimal prosodic structure tree, which can be converted to the corresponding prosodic label sequence. Rather than requiring word segmentation as a prerequisite, the model takes rich linguistic features from a Chinese character-level BERT and feeds them to an encoder with a self-attention architecture. On top of this, span representation and label scoring are used to describe all possible prosodic structure trees, each with its own score. To find the optimal tree with the highest score for a given sentence, a bottom-up CKY-style algorithm is…
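The span-scoring and bottom-up search described in the abstract can be illustrated with a minimal sketch. This is a generic span-based CKY-style decoder, not the authors' implementation: the scorer `toy_score`, the label set `LABELS` (an empty label plus `#1` and `#2`), and the rule in `to_label_sequence` that a character takes the highest-level label of any span ending on it are all assumptions made for illustration. In the paper the scores would come from the BERT-based encoder rather than a lookup table.

```python
from functools import lru_cache

def cky_decode(n, score, labels):
    """Return (best_score, tree) over characters 0..n.

    score(i, j, label) is a hypothetical span-label scorer.
    A tree node is (label, start, end, children).
    """
    @lru_cache(maxsize=None)
    def best(i, j):
        # Pick the best label for span (i, j) independently of the split.
        label = max(labels, key=lambda l: score(i, j, l))
        label_score = score(i, j, label)
        if j - i == 1:
            return label_score, (label, i, j, ())
        # Try every split point and keep the highest-scoring pair of subtrees.
        split_score, k = max(
            (best(i, k)[0] + best(k, j)[0], k) for k in range(i + 1, j)
        )
        children = (best(i, k)[1], best(k, j)[1])
        return label_score + split_score, (label, i, j, children)

    return best(0, n)

def to_label_sequence(tree, n, labels):
    """Assumed conversion: each character gets the highest-ranked label
    among the spans that end right after it (labels are ordered low to high)."""
    rank = {label: r for r, label in enumerate(labels)}
    out = [labels[0]] * n
    stack = [tree]
    while stack:
        label, i, j, children = stack.pop()
        if rank[label] > rank[out[j - 1]]:
            out[j - 1] = label
        stack.extend(children)
    return out

# Toy example: four characters, with hand-picked spans that the scorer favors.
LABELS = ("", "#1", "#2")  # hypothetical label set, lowest level first
FAVORED = {(0, 2, "#1"), (2, 4, "#1"), (0, 4, "#2")}

def toy_score(i, j, label):
    return 1.0 if (i, j, label) in FAVORED else 0.0

total, tree = cky_decode(4, toy_score, LABELS)
sequence = to_label_sequence(tree, 4, LABELS)
print(total, sequence)
```

Because every span's label is chosen independently of the split point, the chart has O(n²) cells and each cell tries O(n) splits, which keeps exact search tractable at sentence length.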
Taxonomy
Topics: Natural Language Processing Techniques · Speech Recognition and Synthesis · Topic Modeling
Methods: Attention Is All You Need · Linear Layer · WordPiece · Weight Decay · Dense Connections · Attention Dropout · Multi-Head Attention · Linear Warmup With Linear Decay · Adam
