Converse Attention Knowledge Transfer for Low-Resource Named Entity Recognition

Shengfei Lyu; Linghao Sun; Huixiong Yi; Yong Liu; Huanhuan Chen; Chunyan Miao

arXiv:1906.01183·cs.CL·January 10, 2023·1 cites

Open Access

TL;DR

This paper introduces Converse Attention Network (CAN), a novel method that leverages high-resource English models and attention-based translation to improve NER performance in low-resource languages.

Contribution

The paper proposes CAN, a new approach that aligns low-resource language features with high-resource English models using attention matrices for better NER results.

Findings

01 CAN achieves significant improvements on four low-resource NER datasets.

02 The method effectively aligns semantic features across languages.

03 Results demonstrate the potential of attention-based transfer for low-resource NLP tasks.

Abstract

In recent years, great success has been achieved in many natural language processing (NLP) tasks, e.g., named entity recognition (NER), especially in high-resource languages such as English, thanks in part to the considerable amount of labeled resources. However, most low-resource languages lack such an abundance of labeled data, leading to poor NER performance in these languages. Inspired by knowledge transfer, we propose the Converse Attention Network, or CAN for short, to improve NER performance in low-resource languages by leveraging the knowledge learned in pretrained high-resource English models. CAN first translates low-resource languages into high-resource English using an attention-based translation module. In the process of translation, CAN obtains the attention matrices that align the two languages. Furthermore, CAN use…
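The abstract is cut off before it says how the attention matrices are used, so the following is only a minimal sketch of one plausible reading: an attention matrix from the translation step is read in the converse direction, so that features computed on the English side are mapped back onto the low-resource tokens. The function names (`attention_matrix`, `map_back_to_source`), the dot-product scoring, and the toy vectors are all assumptions made for illustration and are not taken from the paper.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def attention_matrix(tgt_states, src_states):
    """Hypothetical dot-product attention.

    Rows index English (target) tokens, columns index low-resource
    (source) tokens; each row sums to 1.
    """
    return [softmax([dot(t, s) for s in src_states]) for t in tgt_states]


def map_back_to_source(attn, tgt_feats):
    """Read the attention matrix column-wise (the assumed 'converse' use).

    For each source token, return the attention-weighted average of the
    English-side feature vectors.
    """
    n_src = len(attn[0])
    dim = len(tgt_feats[0])
    aligned = []
    for j in range(n_src):
        col = [row[j] for row in attn]
        norm = sum(col) or 1.0  # guard against an all-zero column
        aligned.append(
            [sum(col[i] * tgt_feats[i][k] for i in range(len(col))) / norm
             for k in range(dim)]
        )
    return aligned


# Toy example: 3 source tokens, 2 English tokens, 2-dimensional features.
src_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
tgt_states = [[0.0, 5.0], [5.0, 0.0]]
tgt_feats = [[0.9, 0.1], [0.2, 0.8]]  # stand-in for pretrained English features

attn = attention_matrix(tgt_states, src_states)
aligned = map_back_to_source(attn, tgt_feats)
```

In this sketch `aligned` holds one feature vector per low-resource token, which could then be combined with the token's own features before tagging; whether CAN combines them this way is not stated in the text shown here.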

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Multimodal Machine Learning Applications