GPT-NER: Named Entity Recognition via Large Language Models
Shuhe Wang, Xiaofei Sun, Xiaoya Li, Rongbin Ouyang, Fei Wu, Tianwei Zhang, Jiwei Li, Guoyin Wang

TL;DR
GPT-NER recasts NER as a text generation task for large language models. It reaches results comparable to supervised methods, does especially well in low-resource scenarios, and uses self-verification to curb hallucinated entities.
Contribution
This paper introduces GPT-NER, an approach that reformulates NER as a generation task for LLMs, narrowing the performance gap between LLMs and supervised baselines and improving NER in low-resource settings.
Findings
GPT-NER achieves comparable performance to supervised models on standard datasets.
GPT-NER outperforms supervised models in low-resource and few-shot settings.
Self-verification reduces hallucination issues in LLM-based NER.
Abstract
Despite the fact that large-scale Language Models (LLMs) have achieved SOTA performance on a variety of NLP tasks, their performance on NER is still significantly below supervised baselines. This is due to the gap between NER and LLMs: the former is a sequence labeling task in nature while the latter are text-generation models. In this paper, we propose GPT-NER to resolve this issue. GPT-NER bridges the gap by transforming the sequence labeling task into a generation task that can be easily adapted by LLMs, e.g., the task of finding location entities in the input text "Columbus is a city" is transformed to generate the text sequence "@@Columbus## is a city", where the special tokens @@## mark the entity to extract. To efficiently address the "hallucination" issue of LLMs, where LLMs have a strong inclination to over-confidently label NULL inputs as entities, we propose a…
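The @@...## output format described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' code: the function names (`mark_entities`, `extract_entities`, `strip_markers`) are hypothetical, and the sketch assumes entities are given as plain strings and that markers are never nested.

```python
import re

# Matches one marked entity, e.g. "@@Columbus##" (assumes markers are not nested).
ENTITY_PATTERN = re.compile(r"@@(.+?)##")


def mark_entities(text, entities):
    """Build the generation target: wrap every occurrence of each entity in @@...##."""
    if not entities:
        return text
    # Longest-first alternation, applied in a single pass, so that an entity
    # that is a substring of another one is not marked twice.
    ordered = sorted(entities, key=len, reverse=True)
    alternation = "|".join(re.escape(e) for e in ordered)
    return re.sub(alternation, lambda m: f"@@{m.group(0)}##", text)


def extract_entities(marked_text):
    """Recover the entity strings from a marked sequence."""
    return ENTITY_PATTERN.findall(marked_text)


def strip_markers(marked_text):
    """Remove the markers, giving back the plain sentence."""
    return ENTITY_PATTERN.sub(r"\1", marked_text)


target = mark_entities("Columbus is a city", ["Columbus"])
print(target)                    # @@Columbus## is a city
print(extract_entities(target))  # ['Columbus']
```

One possible use of `strip_markers` is a sanity check on model output: if the stripped text differs from the input sentence, the model altered the sentence instead of only marking it. That check is an assumption of this sketch, not something the abstract describes.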
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Data Quality and Management
