Multi-task Transformer with Relation-attention and Type-attention for Named Entity Recognition
Ying Mo, Hongyin Tang, Jiahao Liu, Qifan Wang, Zenglin Xu, Jingang Wang, Wei Wu, Zhoujun Li

TL;DR
This paper introduces a multi-task Transformer model for NER that improves boundary detection and type classification by integrating relation and type attention mechanisms, leveraging external knowledge, and unifying various NER task types.
Contribution
It proposes a novel multi-task Transformer with relation-attention and type-attention, enhancing generative NER models across flat, nested, and discontinuous datasets.
Findings
Significant performance improvements on multiple NER benchmarks.
Effective boundary detection via relation classification.
Enhanced entity-type mapping using external knowledge.
Abstract
Named entity recognition (NER) is an important research problem in natural language processing. There are three types of NER tasks: flat, nested, and discontinuous entity recognition. Most previous sequence labeling models are task-specific, while recent years have witnessed the rise of generative models, which can unify all NER tasks in a single seq2seq framework. Although these models achieve promising performance, our pilot studies demonstrate that they are ineffective at detecting entity boundaries and estimating entity types. This paper proposes a multi-task Transformer, which incorporates an entity boundary detection task into the named entity recognition task. More concretely, we achieve entity boundary detection by classifying the relations between tokens within the sentence. To improve the accuracy of entity-type mapping during…
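To make the idea of boundary detection as token-pair relation classification more concrete, here is a minimal sketch of how supervision targets for such a task could be built. The label names (`NONE`, `NEXT`, `TAIL_HEAD`), the function `build_relation_matrix`, and the example sentence are assumptions chosen for illustration; the abstract does not state the paper's actual relation label set, so this should not be read as the authors' method.

```python
from typing import List, Tuple

# Hypothetical relation labels (an assumption, not taken from the paper).
NONE = 0       # no relation between the two tokens
NEXT = 1       # token j directly follows token i inside the same entity
TAIL_HEAD = 2  # token i is the last token and token j the first token of an entity

# An entity is (ordered token indices, type); the indices need not be contiguous,
# which lets the same structure cover flat, nested, and discontinuous entities.
Entity = Tuple[List[int], str]


def build_relation_matrix(num_tokens: int, entities: List[Entity]) -> List[List[int]]:
    """Build an n x n grid of relation labels for one sentence."""
    matrix = [[NONE] * num_tokens for _ in range(num_tokens)]
    for indices, _entity_type in entities:
        # Link consecutive tokens of the entity (always above the diagonal).
        for a, b in zip(indices, indices[1:]):
            matrix[a][b] = NEXT
        # Mark the boundary pair: last token -> first token (on or below the diagonal).
        matrix[indices[-1]][indices[0]] = TAIL_HEAD
    return matrix


tokens = ["aching", "in", "legs", "and", "shoulders"]
entities = [
    ([0, 1, 2], "Symptom"),  # "aching in legs"
    ([0, 1, 4], "Symptom"),  # "aching in shoulders" (discontinuous)
]
matrix = build_relation_matrix(len(tokens), entities)
```

In this sketch a classifier over token pairs would be trained to predict each cell of `matrix`. Placing `NEXT` above the diagonal and `TAIL_HEAD` on or below it keeps the two labels from overwriting each other, including for single-token entities.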
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Domain Adaptation and Few-Shot Learning
Methods: Multi-Head Attention · Attention Is All You Need · Tanh Activation · Sigmoid Activation · Linear Layer · Long Short-Term Memory · Dense Connections · Position-Wise Feed-Forward Layer · Adam · Softmax
