Exploring and Adapting Chinese GPT to Pinyin Input Method

Minghuan Tan; Yong Dai; Duyu Tang; Zhangyin Feng; Guoping Huang; Jing Jiang; Jiwei Li; Shuming Shi

arXiv:2203.00249·cs.CL·March 3, 2022

TL;DR

This paper explores leveraging Chinese GPT for pinyin input methods, addressing challenges with abbreviated pinyin, and proposing strategies to improve performance, supported by a new dataset and comprehensive analysis.

Contribution

First exploration of Chinese GPT for pinyin input, introducing strategies to handle abbreviated pinyin and creating a large multi-domain dataset for evaluation.

Findings

01. Frozen GPT achieves state-of-the-art performance on perfect pinyin.

02. Without mitigation strategies, performance drops sharply on abbreviated pinyin.

03. Enriching the context with pinyin and optimizing the training process improve results on abbreviated pinyin.

Abstract

While GPT has become the de facto method for text generation tasks, its application to the pinyin input method remains unexplored. In this work, we make the first exploration of leveraging Chinese GPT for the pinyin input method. We find that a frozen GPT achieves state-of-the-art performance on perfect pinyin. However, performance drops dramatically when the input includes abbreviated pinyin. One reason is that an abbreviated pinyin can be mapped to many perfect pinyin, which in turn map to an even larger number of Chinese characters. We mitigate this issue with two strategies: enriching the context with pinyin, and optimizing the training process to help distinguish homophones. To further facilitate the evaluation of pinyin input methods, we create a dataset consisting of 270K instances from 15 domains. Results show that our approach improves performance on abbreviated pinyin across all…
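The ambiguity described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the lexicon below is a made-up toy table, and the names `TOY_LEXICON`, `expand`, `candidates`, and `constrained_argmax` are hypothetical. It shows how a perfect pinyin narrows the candidate characters, how an abbreviation (here, just the initial letter) widens them, and how a pinyin constraint could filter a model's character scores.

```python
# Toy pinyin-to-character table (an assumption for illustration only;
# a real input-method lexicon is far larger).
TOY_LEXICON = {
    "wo": ["我", "握", "窝"],
    "wei": ["为", "位", "未"],
    "wen": ["问", "文", "闻"],
    "men": ["们", "门", "闷"],
    "ma": ["吗", "妈", "马"],
    "mei": ["没", "美", "每"],
}


def expand(unit):
    """Return the perfect pinyin syllables consistent with `unit`.

    A perfect pinyin matches only itself. Anything else is treated as an
    abbreviation and matches every syllable that starts with it.
    """
    if unit in TOY_LEXICON:
        return [unit]
    return sorted(s for s in TOY_LEXICON if s.startswith(unit))


def candidates(unit):
    """Return every character reachable from a perfect or abbreviated pinyin."""
    return [ch for syllable in expand(unit) for ch in TOY_LEXICON[syllable]]


def constrained_argmax(scores, unit):
    """Pick the highest-scoring character that satisfies the pinyin constraint.

    `scores` maps characters to hypothetical model scores. Returns None when
    no scored character is allowed by `unit`.
    """
    allowed = set(candidates(unit))
    allowed_scores = {ch: s for ch, s in scores.items() if ch in allowed}
    if not allowed_scores:
        return None
    return max(allowed_scores, key=allowed_scores.get)


if __name__ == "__main__":
    print(len(candidates("wo")))  # candidates for a perfect pinyin
    print(len(candidates("w")))   # candidates for the abbreviation "w"
    toy_scores = {"我": 0.5, "们": 0.9, "问": 0.2}
    print(constrained_argmax(toy_scores, "w"))
```

In this toy table the perfect pinyin "wo" leaves three candidate characters, while the abbreviation "w" leaves nine, so the language model has to resolve more of the ambiguity from context alone.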

Code & Models

Repositories

VisualJoyce/Transformers4IME
PyTorch · Official

Models

🤗
aihijo/transformers4ime-pinyingpt-concat
model · 13 dl · ♡ 3

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques

Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Attention Dropout · Cosine Annealing · Linear Warmup With Cosine Annealing · Byte Pair Encoding · Residual Connection · Layer Normalization