Deep Span Representations for Named Entity Recognition

Enwei Zhu; Yiyang Liu; Jinpeng Li

arXiv:2210.04182 · cs.CL · May 10, 2023

TL;DR

This paper introduces DSpERT, a deep span encoder for NER. A span Transformer builds each span representation layer by layer from the token representations, which particularly helps long-span and nested entities.

Contribution

The paper proposes DSpERT, a span Transformer architecture that produces span representations of deep semantics, in contrast to existing span-based systems that shallowly aggregate token representations.

Findings

1. DSpERT achieves state-of-the-art or competitive results on eight NER benchmarks.

2. Deep span representations improve performance on long-span and nested entities.

3. Deep span features are well-structured and easily separable.

Abstract

Span-based models are one of the most straightforward methods for named entity recognition (NER). Existing span-based NER systems shallowly aggregate the token representations to span representations. However, this typically results in significant ineffectiveness for long-span entities, a coupling between the representations of overlapping spans, and ultimately a performance degradation. In this study, we propose DSpERT (Deep Span Encoder Representations from Transformers), which comprises a standard Transformer and a span Transformer. The latter uses low-layered span representations as queries, and aggregates the token representations as keys and values, layer by layer from bottom to top. Thus, DSpERT produces span representations of deep semantics. With weight initialization from pretrained language models, DSpERT achieves performance higher than or competitive with recent…
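
The layer-by-layer aggregation described in the abstract can be sketched roughly as follows. This is only an illustration under several assumptions that are not stated in the text above: the lowest-layer span representation is obtained by max-pooling, each span attends only to the tokens inside it, attention is single-head without learned projections, and the feed-forward sublayer and layer normalization are omitted. All function names are hypothetical and are not taken from the authors' code.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def shallow_span_rep(tokens, start, end):
    """Max-pool the token vectors in tokens[start:end].

    Assumption: max-pooling stands in for the shallow aggregation that
    existing span-based systems apply on top of the token encoder.
    """
    dim = len(tokens[0])
    return [max(tokens[i][d] for i in range(start, end)) for d in range(dim)]


def span_attention(query, tokens, start, end):
    """Scaled dot-product attention with the span vector as the query and
    the in-span token vectors as both keys and values (assumed scope)."""
    dim = len(query)
    keys = tokens[start:end]
    scores = [
        sum(q * k for q, k in zip(query, key)) / math.sqrt(dim) for key in keys
    ]
    weights = softmax(scores)
    return [sum(w * key[d] for w, key in zip(weights, keys)) for d in range(dim)]


def deep_span_rep(layer_tokens, start, end):
    """Build a span representation from bottom to top.

    layer_tokens[l] holds the token vectors produced at layer l, with
    l = 0 the lowest layer.
    """
    span = shallow_span_rep(layer_tokens[0], start, end)
    for tokens in layer_tokens[1:]:
        attended = span_attention(span, tokens, start, end)
        # Residual connection: add the attended summary to the span vector.
        span = [s + a for s, a in zip(span, attended)]
    return span


if __name__ == "__main__":
    # Two layers, three tokens, two dimensions (toy values).
    layers = [
        [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]],
        [[2.0, 0.0], [0.0, 2.0], [1.0, 1.0]],
    ]
    print(deep_span_rep(layers, 0, 3))
```

In this sketch each span keeps its own query vector through the layers, so two overlapping spans can weight the same tokens differently instead of sharing one pooled summary.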

Code & Models

Repositories

syuoni/eznlp · PyTorch · Official

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Data Quality and Management

Methods: Attention Is All You Need · Linear Layer · Position-Wise Feed-Forward Layer · Residual Connection · Dropout · Softmax · Label Smoothing · Multi-Head Attention · Adam · Dense Connections