Exploring and Adapting Chinese GPT to Pinyin Input Method
Minghuan Tan, Yong Dai, Duyu Tang, Zhangyin Feng, Guoping Huang, Jing Jiang, Jiwei Li, Shuming Shi

TL;DR
This paper explores leveraging Chinese GPT for pinyin input methods, addressing challenges with abbreviated pinyin, and proposing strategies to improve performance, supported by a new dataset and comprehensive analysis.
Contribution
First exploration of Chinese GPT for pinyin input, introducing strategies to handle abbreviated pinyin and creating a large multi-domain dataset for evaluation.
Findings
Frozen GPT achieves state-of-the-art on perfect pinyin
Performance drops with abbreviated pinyin without strategies
Enriching context and training optimization improve results
Abstract
While GPT has become the de facto method for text generation tasks, its application to pinyin input methods remains unexplored. In this work, we make the first exploration of leveraging Chinese GPT for pinyin input methods. We find that a frozen GPT achieves state-of-the-art performance on perfect pinyin. However, the performance drops dramatically when the input includes abbreviated pinyin. One reason is that an abbreviated pinyin can be mapped to many perfect pinyin, which in turn map to an even larger number of Chinese characters. We mitigate this issue with two strategies: enriching the context with pinyin, and optimizing the training process to help distinguish homophones. To further facilitate the evaluation of pinyin input methods, we create a dataset consisting of 270K instances from 15 domains. Results show that our approach improves performance on abbreviated pinyin across all…
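The ambiguity described in the abstract can be made concrete with a toy sketch. The lexicon below is a hypothetical, hand-picked handful of characters (not the paper's vocabulary or dataset), tones are omitted, and "abbreviated pinyin" is assumed here to mean the first letter of the full syllable. The function names are illustrative only and do not come from the paper.

```python
from collections import defaultdict

# Hypothetical toy lexicon: character -> perfect (full) pinyin, tones omitted.
# A real input method would use a vocabulary of thousands of characters.
TOY_LEXICON = {
    "我": "wo",
    "握": "wo",
    "为": "wei",
    "晚": "wan",
    "玩": "wan",
    "王": "wang",
    "你": "ni",
    "年": "nian",
}


def build_indexes(lexicon):
    """Index characters by perfect pinyin and by abbreviation (first letter)."""
    by_perfect = defaultdict(set)
    by_abbrev = defaultdict(set)
    for char, pinyin in lexicon.items():
        by_perfect[pinyin].add(char)
        # Assumption: the abbreviation is the first letter of the syllable.
        by_abbrev[pinyin[0]].add(char)
    return by_perfect, by_abbrev


BY_PERFECT, BY_ABBREV = build_indexes(TOY_LEXICON)


def perfect_candidates(pinyin):
    """Characters whose full pinyin matches exactly (homophones)."""
    return set(BY_PERFECT.get(pinyin, set()))


def abbreviated_candidates(letter):
    """Characters whose full pinyin merely starts with the given letter."""
    return set(BY_ABBREV.get(letter, set()))


if __name__ == "__main__":
    print(len(perfect_candidates("wo")))      # 2 candidates for perfect pinyin
    print(len(abbreviated_candidates("w")))   # 6 candidates for the abbreviation
```

Even in this tiny lexicon, the abbreviation "w" admits three times as many candidates as the perfect pinyin "wo", which suggests why a model that only sees abbreviations needs extra context or training signal to pick the intended character.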
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques
Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Attention Dropout · Cosine Annealing · Linear Warmup With Cosine Annealing · Byte Pair Encoding · Residual Connection · Layer Normalization
