Enhancing Pre-trained Chinese Character Representation with Word-aligned Attention
Yanzeng Li, Bowen Yu, Mengge Xue, Tingwen Liu

TL;DR
This paper introduces a word-aligned attention mechanism to incorporate explicit word information into Chinese pre-trained models, improving their understanding by aligning character and word representations.
Contribution
It proposes a novel pooling-based word-aligned attention method that enhances Chinese pre-trained models by explicitly integrating word-level semantics.
Findings
Significant performance improvements over several pre-trained models on five Chinese NLP benchmark tasks.
Effective mitigation of segmentation error propagation.
Enhanced integration of word and character information.
Abstract
Most Chinese pre-trained models take the character as the basic unit and learn representations from each character's external context, ignoring the semantics expressed by the word, which is the smallest meaningful utterance in Chinese. Hence, we propose a novel word-aligned attention to exploit explicit word information, which is complementary to various character-based Chinese pre-trained language models. Specifically, we devise a pooling mechanism to align the character-level attention to the word level and propose to alleviate the potential issue of segmentation error propagation through multi-source information fusion. As a result, word and character information are explicitly integrated during fine-tuning. Experimental results on five Chinese NLP benchmark tasks demonstrate that our model brings a further significant gain over several pre-trained models.
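The pooling and fusion steps described in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration, not the authors' implementation: the function names `word_align_attention` and `fuse_segmentations`, the `mix` parameter that blends max and mean pooling, the half-open `(start, end)` span format, and the use of a plain average to combine several segmenters.

```python
from typing import List, Tuple

Matrix = List[List[float]]
Spans = List[Tuple[int, int]]


def word_align_attention(attn: Matrix, spans: Spans, mix: float = 0.5) -> Matrix:
    """Pool each row of a character-level attention matrix over word spans.

    attn[i][j] is the weight character i places on character j.
    spans holds one half-open (start, end) character range per word.
    mix blends max pooling (1.0) with mean pooling (0.0); this blend is
    a hypothetical choice for the sketch.
    """
    aligned = []
    for row in attn:
        new_row = [0.0] * len(row)
        for start, end in spans:
            chunk = row[start:end]
            pooled = mix * max(chunk) + (1.0 - mix) * sum(chunk) / len(chunk)
            # Every character inside the word shares the pooled weight.
            for j in range(start, end):
                new_row[j] = pooled
        aligned.append(new_row)
    return aligned


def fuse_segmentations(attn: Matrix, segmentations: List[Spans], mix: float = 0.5) -> Matrix:
    """Average the word-aligned attention produced by several segmenters."""
    fused = [[0.0] * len(row) for row in attn]
    for spans in segmentations:
        aligned = word_align_attention(attn, spans, mix)
        for i, row in enumerate(aligned):
            for j, value in enumerate(row):
                fused[i][j] += value / len(segmentations)
    return fused


# Toy input: three characters, segmented as a two-character word plus one character.
attn = [[0.25, 0.75, 0.0], [0.5, 0.0, 0.5], [0.0, 0.0, 1.0]]
aligned = word_align_attention(attn, [(0, 2), (2, 3)], mix=0.5)
```

Averaging over several segmentations is one simple way to keep a single segmenter's mistake from dominating the aligned weights. Note that pooled rows no longer sum to one in this sketch, so a real model would likely renormalize or apply the weights differently.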
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Computational and Text Analysis Methods
