Converse Attention Knowledge Transfer for Low-Resource Named Entity Recognition
Shengfei Lyu, Linghao Sun, Huixiong Yi, Yong Liu, Huanhuan Chen, Chunyan Miao

TL;DR
This paper introduces Converse Attention Network (CAN), a novel method that leverages high-resource English models and attention-based translation to improve NER performance in low-resource languages.
Contribution
The paper proposes CAN, a new approach that aligns low-resource language features with high-resource English models using attention matrices for better NER results.
Findings
CAN achieves significant improvements on four low-resource NER datasets.
The method effectively aligns semantic features across languages.
Results demonstrate the potential of attention-based transfer for low-resource NLP tasks.
Abstract
In recent years, great success has been achieved in many natural language processing (NLP) tasks, e.g., named entity recognition (NER), especially in the high-resource language, i.e., English, thanks in part to the considerable amount of labeled resources. However, most low-resource languages do not have such an abundance of labeled data as high-resource English, leading to poor NER performance in these languages. Inspired by knowledge transfer, we propose Converse Attention Network, or CAN for short, to improve the performance of NER in low-resource languages by leveraging the knowledge learned in pretrained high-resource English models. CAN first translates low-resource languages into high-resource English using an attention-based translation module. In the process of translation, CAN obtains the attention matrices that align the two languages. Furthermore, CAN use…
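The abstract is cut off before it says how the attention matrices are used, so the following is only a minimal sketch of the general idea: a translation-style attention matrix gives each low-resource token a distribution over English tokens, and features computed on the English side can be carried back as attention-weighted averages. The dot-product scoring, the toy vectors, and the function names (`attention_matrix`, `project_features`, `hard_alignment`) are assumptions made for illustration and are not taken from the paper.

```python
import math

def softmax(scores):
    """Normalize raw scores into a probability distribution."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_matrix(src_vecs, en_vecs):
    """One row per low-resource token, one column per English token.

    Dot-product scoring is an assumption; the paper's translation module
    may score token pairs differently.
    """
    return [
        softmax([sum(a * b for a, b in zip(s, e)) for e in en_vecs])
        for s in src_vecs
    ]

def project_features(attn, en_feats):
    """Carry English-side features back to low-resource tokens
    as attention-weighted averages."""
    dim = len(en_feats[0])
    return [
        [sum(w * feat[d] for w, feat in zip(row, en_feats)) for d in range(dim)]
        for row in attn
    ]

def hard_alignment(attn):
    """Index of the most-attended English token for each low-resource token."""
    return [max(range(len(row)), key=row.__getitem__) for row in attn]

# Toy, made-up vectors: 2 low-resource tokens, 3 English tokens.
src_vecs = [[4.0, 0.0], [0.0, 4.0]]
en_vecs = [[0.0, 1.0], [1.0, 0.0], [0.0, 0.0]]
# Hypothetical features from a pretrained English model (one row per English token).
en_feats = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]

attn = attention_matrix(src_vecs, en_vecs)
projected = project_features(attn, en_feats)
alignment = hard_alignment(attn)
```

Each row of `attn` sums to 1, so every row of `projected` is a convex combination of English feature vectors. In this reading, a tagger for the low-resource language could consume `projected` alongside its own token features.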
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Multimodal Machine Learning Applications
