Fighting Against the Repetitive Training and Sample Dependency Problem in Few-shot Named Entity Recognition

Chang Tian; Wenpeng Yin; Dan Li; Marie-Francine Moens

arXiv:2406.05460 · cs.CL · June 21, 2024

TL;DR

This paper proposes a few-shot NER pipeline that mitigates the repetitive training and sample dependency problems by pre-training a span detector on Wikipedia data and using LLMs to set entity-type referents. It outperforms baselines with fewer training steps and less labeled data.

Contribution

It introduces a steppingstone span detector pre-trained on Wikipedia and leverages large language models to set entity-type referents, addressing key limitations in existing few-shot NER methods.

Findings

01 Outperforms baselines with fewer training steps and less labeled data.

02 Achieves stronger results on fine-grained few-shot NER, surpassing ChatGPT.

03 Reduces the repetitive training and sample dependency problems.

Abstract

Few-shot named entity recognition (NER) systems recognize entities using a few labeled training examples. The general pipeline consists of a span detector to identify entity spans in text and an entity-type classifier to assign types to entities. Current span detectors rely on extensive manual labeling to guide training. Almost every span detector requires initial training on basic span features followed by adaptation to task-specific features. This process leads to repetitive training of the basic span features among span detectors. Additionally, metric-based entity-type classifiers, such as prototypical networks, typically employ a specific metric that gauges the distance between the query sample and entity-type referents, ultimately assigning the most probable entity type to the query sample. However, these classifiers encounter the sample dependency problem, primarily stemming from…
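To make the metric-based classification step in the abstract concrete, here is a minimal sketch of a prototypical-network-style classifier: each entity-type referent is the mean of that type's support embeddings, and a query is assigned the type of the nearest referent. This illustrates the baseline approach the abstract describes, not the pipeline the paper proposes. The function names, the two-dimensional toy embeddings, and the choice of Euclidean distance are all assumptions made for illustration.

```python
import math
from collections import defaultdict


def build_prototypes(support):
    """Average the support embeddings of each entity type into one referent.

    `support` is a list of (entity_type, embedding) pairs; the embeddings
    are hypothetical toy vectors, not outputs of a real encoder.
    """
    grouped = defaultdict(list)
    for label, vec in support:
        grouped[label].append(vec)
    return {
        label: [sum(dim) / len(vecs) for dim in zip(*vecs)]
        for label, vecs in grouped.items()
    }


def classify(query, prototypes):
    """Return the entity type whose referent is closest to the query.

    Euclidean distance is an assumed metric; other metrics could be used.
    """
    return min(prototypes, key=lambda label: math.dist(query, prototypes[label]))


# Toy support set: two labeled examples per entity type.
support = [
    ("PERSON", [1.0, 0.0]),
    ("PERSON", [0.8, 0.2]),
    ("LOCATION", [0.0, 1.0]),
    ("LOCATION", [0.2, 0.8]),
]

prototypes = build_prototypes(support)
prediction = classify([0.7, 0.1], prototypes)
print(prediction)
```

In this sketch the referents are computed entirely from the handful of support samples, so swapping in different support examples moves the referents and can change the prediction.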

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Domain Adaptation and Few-Shot Learning

Methods: Sparse Evolutionary Training