A Comparative Study of Pre-trained Encoders for Low-Resource Named Entity Recognition

Yuxuan Chen, Jonas Mikkelsen, Arne Binder, Christoph Alt, and Leonhard Hennig

arXiv:2204.04980 · cs.CL · April 12, 2022

TL;DR

This paper systematically compares various pre-trained encoders for low-resource named entity recognition, highlighting the importance of encoder selection due to significant performance variation across different models and strategies.

Contribution

The paper introduces an encoder evaluation framework and uses it to compare pre-trained representations for low-resource NER across multiple training strategies and model architectures.
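To make the idea of comparing encoders under a low-resource budget concrete, here is a minimal sketch. It is not the paper's framework and not the API of the linked repository. The two "encoders" are hypothetical toy functions standing in for pre-trained models, the nearest-centroid classifier and the episode sampling are assumptions chosen for brevity, and token accuracy is used in place of a proper NER metric.

```python
import random
from statistics import mean

# Hypothetical toy "encoders": each maps a token to a feature vector.
# Real pre-trained encoders would return contextual embeddings instead.
def encoder_shape(token):
    # Capitalised?, all digits?, scaled length.
    return [float(token[:1].isupper()), float(token.isdigit()), len(token) / 10.0]

def encoder_length_only(token):
    # Deliberately weak representation: scaled length only.
    return [len(token) / 10.0]

def centroid(vectors):
    # Component-wise mean of a list of equal-length vectors.
    return [mean(col) for col in zip(*vectors)]

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def run_episode(encoder, data, k_shot, rng):
    """Sample k_shot support tokens per label, classify the remaining
    tokens by nearest class centroid, and return token accuracy."""
    prototypes, queries = {}, []
    for label, tokens in data.items():
        shuffled = tokens[:]
        rng.shuffle(shuffled)
        support, query = shuffled[:k_shot], shuffled[k_shot:]
        prototypes[label] = centroid([encoder(t) for t in support])
        queries += [(t, label) for t in query]
    correct = 0
    for token, gold in queries:
        vec = encoder(token)
        pred = min(prototypes, key=lambda lab: sq_dist(vec, prototypes[lab]))
        correct += pred == gold
    return correct / len(queries)

def evaluate(encoder, data, k_shot=2, episodes=50, seed=0):
    # Average accuracy over several randomly sampled few-shot episodes.
    rng = random.Random(seed)
    return mean([run_episode(encoder, data, k_shot, rng) for _ in range(episodes)])

# Made-up toy data: label -> example tokens (not from any benchmark).
TOY_DATA = {
    "PER": ["Alice", "Bob", "Carol", "David", "Erin"],
    "O": ["the", "ran", "quickly", "and", "home"],
    "NUM": ["42", "7", "2022", "315", "19"],
}

scores = {
    "shape": evaluate(encoder_shape, TOY_DATA),
    "length_only": evaluate(encoder_length_only, TOY_DATA),
}
```

Holding the data, the shot count, and the classifier fixed means that any gap between the two entries in `scores` comes from the representation alone, which is the kind of comparison the contribution describes.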

Findings

01 Encoder performance varies significantly across models.

02 The choice of encoder affects low-resource NER effectiveness.

03 Results across ten benchmark NER datasets in English and German differ widely by encoder.

Abstract

Pre-trained language models (PLMs) are effective components of few-shot named entity recognition (NER) approaches when augmented with continued pre-training on task-specific out-of-domain data or fine-tuning on in-domain data. However, their performance in low-resource scenarios, where such data is not available, remains an open question. We introduce an encoder evaluation framework, and use it to systematically compare the performance of state-of-the-art pre-trained representations on the task of low-resource NER. We analyze a wide range of encoders pre-trained with different strategies, model architectures, intermediate-task fine-tuning, and contrastive learning. Our experimental results across ten benchmark NER datasets in English and German show that encoder performance varies significantly, suggesting that the choice of encoder for a specific low-resource scenario needs to be…

Code & Models

Repositories

dfki-nlp/fewie (PyTorch, official)

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Text and Document Classification Technologies