PANER: A Paraphrase-Augmented Framework for Low-Resource Named Entity Recognition
Nanda Kumar Rengarajan, Jun Yan, Chun Wang

TL;DR
This paper introduces PANER, a lightweight few-shot NER framework that combines instruction tuning with paraphrase-based data augmentation to improve low-resource NER, reaching performance comparable to state-of-the-art models.
Contribution
The paper proposes a novel instruction tuning template and a paraphrasing data augmentation technique specifically designed for low-resource NER tasks.
Findings
Achieves an average F1 score of 80.1 on CrossNER datasets.
Paraphrasing improves F1 scores by up to 17 points.
Performs comparably to state-of-the-art models in few-shot and zero-shot settings.
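The F1 figures above can be read against the usual NER scoring convention. The sketch below assumes entity-level exact matching, where a prediction counts only if both the entity text and its label match the gold annotation, with counts pooled over all sentences (micro-averaging). This summary does not say which scoring variant the paper uses, so the function name `entity_f1` and the input layout are illustrative assumptions.

```python
from typing import List, Set, Tuple

Entity = Tuple[str, str]  # (entity text, label)


def entity_f1(gold: List[Set[Entity]], pred: List[Set[Entity]]) -> float:
    """Micro-averaged entity-level F1 over parallel lists of per-sentence entity sets.

    Assumption: exact match on (text, label); partial matches earn no credit.
    """
    if len(gold) != len(pred):
        raise ValueError("gold and pred must cover the same sentences")

    true_pos = sum(len(g & p) for g, p in zip(gold, pred))
    n_pred = sum(len(p) for p in pred)
    n_gold = sum(len(g) for g in gold)

    precision = true_pos / n_pred if n_pred else 0.0
    recall = true_pos / n_gold if n_gold else 0.0
    if precision + recall == 0.0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Toy data, not from the paper: one correct entity, one missed, one mislabeled.
gold = [{("Marie Curie", "person"), ("Paris", "location")}, {("ACL", "conference")}]
pred = [{("Paris", "location")}, {("ACL", "organisation")}]
score = entity_f1(gold, pred)
```

In the toy data, precision is 1/2 and recall is 1/3, which gives an F1 of 0.4. A reported "17 point" gain would correspond to a difference of 0.17 on this 0 to 1 scale.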
Abstract
Named Entity Recognition (NER) is a critical task that requires substantial annotated data, making it challenging in low-resource scenarios where label acquisition is expensive. While zero-shot and instruction-tuned approaches have made progress, they often fail to generalize to domain-specific entities and do not effectively utilize limited available data. We present a lightweight few-shot NER framework that addresses these challenges through two key innovations: (1) a new instruction tuning template with a simplified output format that combines principles from prior instruction tuning (IT) approaches to leverage the large context window of recent state-of-the-art LLMs; and (2) a strategic data augmentation technique that preserves entity information while paraphrasing the surrounding context, thereby expanding the training data without compromising semantic relationships. Experiments on benchmark…
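The second idea, paraphrasing the context while keeping entities intact, might be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: entities are swapped for placeholders, a paraphraser rewrites the masked sentence, and any candidate that drops or duplicates a placeholder is rejected. The names `mask_entities`, `restore_entities`, `augment`, the `[ENT0]` placeholder format, and the `paraphrase_fn` callable (a stand-in for an LLM call) are all hypothetical.

```python
from typing import Callable, Dict, List, Optional, Tuple

Entity = Tuple[str, str]  # (entity text, label)


def mask_entities(sentence: str, entities: List[Entity]) -> Tuple[str, Dict[str, str]]:
    """Replace each entity mention with a placeholder such as [ENT0]."""
    masked = sentence
    mapping: Dict[str, str] = {}
    # Longest mentions first, so a short entity cannot clobber part of a longer one.
    ordered = sorted(entities, key=lambda e: len(e[0]), reverse=True)
    for i, (text, _label) in enumerate(ordered):
        if text not in masked:
            raise ValueError(f"entity {text!r} not found in sentence")
        placeholder = f"[ENT{i}]"
        masked = masked.replace(text, placeholder, 1)
        mapping[placeholder] = text
    return masked, mapping


def restore_entities(paraphrase: str, mapping: Dict[str, str]) -> Optional[str]:
    """Put entity text back; return None if any placeholder was lost or repeated."""
    for placeholder in mapping:
        if paraphrase.count(placeholder) != 1:
            return None
    restored = paraphrase
    for placeholder, text in mapping.items():
        restored = restored.replace(placeholder, text)
    return restored


def augment(
    sentence: str,
    entities: List[Entity],
    paraphrase_fn: Callable[[str], List[str]],  # hypothetical: wraps an LLM paraphraser
) -> List[Tuple[str, List[Entity]]]:
    """Return new (sentence, entities) pairs whose entity annotations are unchanged."""
    masked, mapping = mask_entities(sentence, entities)
    augmented = []
    for candidate in paraphrase_fn(masked):
        restored = restore_entities(candidate, mapping)
        if restored is not None and restored != sentence:
            augmented.append((restored, list(entities)))
    return augmented


# Toy paraphraser with fixed outputs; a real one would call a language model.
def toy_paraphraser(masked: str) -> List[str]:
    return ["In [ENT1], [ENT0] gave a talk.", "A talk was given."]


example_entities = [("Marie Curie", "person"), ("Paris", "location")]
examples = augment("Marie Curie gave a talk in Paris.", example_entities, toy_paraphraser)
```

With the toy paraphraser, only the first candidate survives, because the second one lost both placeholders. The rejection step is what keeps the labels valid: an augmented sentence enters the training set only if every annotated entity is still present verbatim.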
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Advanced Graph Neural Networks
