TL;DR
This paper investigates cross-sentence context with BERT models to improve named entity recognition (NER) across multiple languages, and introduces a simple voting method that improves results without modifying the BERT architecture.
Contribution
It systematically studies cross-sentence context for NER with BERT and proposes Contextual Majority Voting to boost accuracy across five languages.
Findings
Adding cross-sentence context improves NER performance in all tested languages.
The proposed Contextual Majority Voting (CMV) method further increases accuracy without changing the BERT architecture.
The approach achieves state-of-the-art results on several NER benchmarks.
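This summary does not spell out how the voting step works, so the following is only a minimal sketch of one plausible reading. It assumes that the same sentence is tagged several times, each time inside a different context window, and that the per-token labels are then combined by majority vote. The function name `majority_vote`, the tie-breaking rule, and the example tags are illustrative assumptions, not the authors' implementation.

```python
from collections import Counter
from typing import List, Sequence


def majority_vote(predictions: Sequence[Sequence[str]]) -> List[str]:
    """Combine several tag sequences for the same sentence, token by token.

    `predictions` holds one tag sequence per context window in which the
    sentence appeared (hypothetical setup). All sequences must have the
    same length. Ties go to the tag from the earliest window, which is an
    arbitrary choice made for this sketch.
    """
    if not predictions:
        raise ValueError("need at least one prediction")
    length = len(predictions[0])
    if any(len(p) != length for p in predictions):
        raise ValueError("all predictions must cover the same tokens")

    voted = []
    for token_tags in zip(*predictions):
        counts = Counter(token_tags)
        best = max(counts.values())
        # Pick the first tag, in window order, that reaches the top count.
        voted.append(next(tag for tag in token_tags if counts[tag] == best))
    return voted


# Hypothetical predictions for one four-token sentence seen in three windows.
window_predictions = [
    ["B-PER", "I-PER", "O", "B-LOC"],
    ["B-LOC", "I-PER", "O", "B-LOC"],
    ["B-PER", "I-PER", "O", "B-LOC"],
]
print(majority_vote(window_predictions))
```

Because the vote operates only on model outputs, a scheme like this needs no change to the underlying model, which is consistent with the claim that the BERT architecture stays untouched.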
Abstract
Named entity recognition (NER) is frequently addressed as a sequence classification task where each input consists of one sentence of text. It is nevertheless clear that useful information for the task can often be found outside of the scope of a single-sentence context. Recently proposed self-attention models such as BERT can both efficiently capture long-distance relationships in input and represent inputs consisting of several sentences, creating new opportunities for approaches that incorporate cross-sentence information in natural language processing tasks. In this paper, we present a systematic study exploring the use of cross-sentence information for NER using BERT models in five languages. We find that adding context in the form of additional sentences to BERT input systematically increases NER performance on all of the tested languages and models. Including multiple…
Taxonomy
Methods: Linear Layer · Weight Decay · Softmax · Adam · Multi-Head Attention · Dropout · Attention Dropout · Linear Warmup With Linear Decay · Dense Connections
