Transformer-GCRF: Recovering Chinese Dropped Pronouns with General Conditional Random Fields
Jingxuan Yang, Kerui Xu, Jun Xu, Si Li, Sheng Gao, Jun Guo, Ji-Rong Wen, Nianwen Xue

TL;DR
This paper introduces Transformer-GCRF, a novel model combining Transformer networks with General Conditional Random Fields to improve the recovery of dropped pronouns in Chinese conversations by modeling inter-utterance dependencies.
Contribution
The paper proposes a new framework that integrates Transformer and GCRF to better model pronoun dependencies, outperforming existing methods in Chinese dropped pronoun recovery.
Findings
Transformer-GCRF outperforms state-of-the-art dropped pronoun recovery models.
GCRF captures dependencies between pronouns in neighboring utterances.
The model improves dropped pronoun recovery on three Chinese conversation datasets.
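To illustrate why modeling neighboring utterances can change a prediction, here is a toy sketch. It is not the paper's model. It assumes one candidate pronoun slot per utterance, hand-set scores, and a brute-force search over label combinations. The names `joint_score`, `decode_independent`, `decode_joint`, the label set, and every number are hypothetical.

```python
from itertools import product

# Hypothetical label set: "none" means no pronoun was dropped.
LABELS = ["none", "我", "你"]


def joint_score(labels, emissions, cross):
    """Per-utterance scores plus a pairwise score for each neighboring pair."""
    score = sum(emissions[i][lab] for i, lab in enumerate(labels))
    score += sum(cross.get((a, b), 0.0) for a, b in zip(labels, labels[1:]))
    return score


def decode_independent(emissions):
    """Label each utterance on its own, ignoring its neighbors."""
    return [max(LABELS, key=lambda lab: e[lab]) for e in emissions]


def decode_joint(emissions, cross):
    """Pick the label combination with the highest joint score (brute force)."""
    best = max(product(LABELS, repeat=len(emissions)),
               key=lambda labs: joint_score(labs, emissions, cross))
    return list(best)


# Made-up scores: utterance 1 clearly drops 你, utterance 2 is ambiguous.
emissions = [
    {"none": 0.1, "我": 0.2, "你": 1.0},
    {"none": 0.5, "我": 0.4, "你": 0.3},
]
# Made-up assumption: a question about 你 tends to be answered with 我.
cross = {("你", "我"): 0.8, ("我", "你"): 0.8}

independent = decode_independent(emissions)
joint = decode_joint(emissions, cross)
```

With these made-up scores, independent decoding leaves the second utterance unlabeled, while joint decoding recovers 我 because the pairwise term rewards the 你/我 pairing. Brute-force search is only practical at this toy scale.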
Abstract
Pronouns are often dropped in Chinese conversations and recovering the dropped pronouns is important for NLP applications such as Machine Translation. Existing approaches usually formulate this as a sequence labeling task of predicting whether there is a dropped pronoun before each token and its type. Each utterance is considered to be a sequence and labeled independently. Although these approaches have shown promise, labeling each utterance independently ignores the dependencies between pronouns in neighboring utterances. Modeling these dependencies is critical to improving the performance of dropped pronoun recovery. In this paper, we present a novel framework that combines the strength of the Transformer network with General Conditional Random Fields (GCRF) to model the dependencies between pronouns in neighboring utterances. Results on three Chinese conversation datasets show that the…
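The sequence labeling formulation described in the abstract can be sketched as follows: each token carries one label that names the pronoun dropped before it, or a placeholder when nothing was dropped. This is a minimal illustration, not the authors' code. The function name `recover`, the `"none"` placeholder, the parenthesized output style, and the example utterance are all assumptions.

```python
def recover(tokens, labels, none_label="none"):
    """Re-insert each predicted pronoun before the token it is attached to."""
    if len(tokens) != len(labels):
        raise ValueError("need exactly one label per token")
    out = []
    for token, label in zip(tokens, labels):
        if label != none_label:
            # Mark the recovered pronoun with parentheses (illustrative choice).
            out.append(f"({label})")
        out.append(token)
    return out


# Hypothetical utterance: "(你) 喜欢 这 部 电影 吗" ("Do (you) like this movie?")
tokens = ["喜欢", "这", "部", "电影", "吗"]
labels = ["你", "none", "none", "none", "none"]
recovered = recover(tokens, labels)
```

Here `recovered` places the pronoun 你 in front of 喜欢. A tagger that labels each utterance independently produces one such label sequence per utterance without looking at the neighboring ones.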
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Speech Recognition and Synthesis
Methods: Linear Layer · Absolute Position Encodings · Position-Wise Feed-Forward Layer · Dense Connections · Layer Normalization · Byte Pair Encoding · Multi-Head Attention · Dropout · Label Smoothing · Attention Is All You Need
