Enhancing Pre-trained Chinese Character Representation with Word-aligned Attention

Yanzeng Li; Bowen Yu; Mengge Xue; Tingwen Liu

arXiv:1911.02821 · cs.CL · April 30, 2020 · 1 citation

Open Access · 1 Repo

TL;DR

This paper introduces a word-aligned attention mechanism that incorporates explicit word information into character-based Chinese pre-trained models by pooling character-level attention to the word level during fine-tuning.

Contribution

It proposes a novel pooling-based word-aligned attention method that enhances Chinese pre-trained models by explicitly integrating word-level semantics.

Findings

01. Significant performance improvements on five Chinese NLP benchmarks.
02. Effective mitigation of segmentation error propagation.
03. Enhanced integration of word and character information.

Abstract

Most Chinese pre-trained models take the character as the basic unit and learn representations from each character's external contexts, ignoring the semantics expressed by the word, which is the smallest meaningful utterance in Chinese. Hence, we propose a novel word-aligned attention to exploit explicit word information, which is complementary to various character-based Chinese pre-trained language models. Specifically, we devise a pooling mechanism to align the character-level attention to the word level and propose to alleviate the potential issue of segmentation error propagation by multi-source information fusion. As a result, word and character information are explicitly integrated during fine-tuning. Experimental results on five Chinese NLP benchmark tasks demonstrate that our model brings a further significant gain over several pre-trained models.
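
The pooling and fusion steps described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the abstract only says "a pooling mechanism" and "multi-source information fusion", so the mix of max and mean pooling (weight `lam`), the plain averaging over several segmentations, and the function names `word_spans`, `align_row`, and `word_aligned_attention` are all assumptions made for illustration. The sketch uses plain Python lists and does not renormalize the aligned weights.

```python
from typing import List, Sequence, Tuple


def word_spans(words: Sequence[str]) -> List[Tuple[int, int]]:
    """Turn a segmentation (a list of words) into half-open character spans."""
    spans, start = [], 0
    for word in words:
        spans.append((start, start + len(word)))
        start += len(word)
    return spans


def align_row(row: Sequence[float], spans, lam: float = 0.5) -> List[float]:
    """Pool one query's character-level attention weights inside each word,
    then copy the pooled value back to every character of that word.

    Assumption: the pooled value is lam * max + (1 - lam) * mean.
    """
    out = [0.0] * len(row)
    for start, end in spans:
        chunk = row[start:end]
        pooled = lam * max(chunk) + (1.0 - lam) * sum(chunk) / len(chunk)
        for i in range(start, end):
            out[i] = pooled
    return out


def word_aligned_attention(attn, segmentations, lam: float = 0.5):
    """Align a square character-level attention matrix under each
    segmentation, then average the aligned matrices.

    Assumption: "multi-source fusion" is modeled here as a simple average
    over segmentations produced by different (hypothetical) segmenters.
    """
    n = len(attn)
    fused = [[0.0] * n for _ in range(n)]
    for words in segmentations:
        spans = word_spans(words)
        if not spans or spans[-1][1] != n:
            raise ValueError("segmentation must cover every character")
        for q, row in enumerate(attn):
            aligned = align_row(row, spans, lam)
            for k in range(n):
                fused[q][k] += aligned[k] / len(segmentations)
    return fused


# Toy input: four characters with two competing segmentations.
attn = [[0.1, 0.3, 0.4, 0.2] for _ in range(4)]
segmentations = [["研究", "生命"], ["研究生", "命"]]
fused = word_aligned_attention(attn, segmentations)
```

Under a single segmentation, characters in the same word always receive the same attention weight, which is the sense in which attention is "word-aligned" here. Averaging over several segmentations softens the effect of any one segmenter's mistakes, which is one plausible reading of how fusion could limit error propagation.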

Code & Models

Repositories

lsvih/MWA (PyTorch, official)

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Computational and Text Analysis Methods