PICLe: Pseudo-Annotations for In-Context Learning in Low-Resource Named Entity Detection

Sepideh Mamooler; Syrielle Montariol; Alexander Mathis; Antoine Bosselut

arXiv:2412.11923·cs.CL·April 2, 2025

TL;DR

This paper introduces PICLe, a novel framework that uses pseudo-annotations generated by LLMs to improve low-resource Named Entity Detection, reducing reliance on human-labeled data and enhancing in-context learning performance.

Contribution

The paper proposes PICLe, a method leveraging pseudo-annotated demonstrations and clustering to enhance in-context learning for low-resource NED tasks, demonstrating effectiveness without human annotations.

Findings

01. PICLe outperforms standard ICL on low-resource biomedical NED datasets.

02. Demonstrations with partially correct, LLM-generated pseudo-annotations can be as effective as fully correct demonstrations.

03. Clustering and self-verification improve the selection of in-context demonstrations.
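
A minimal sketch of the clustering-based selection named in the findings, assuming each pseudo-annotated demonstration already comes with an embedding vector. The small k-means routine, the `select_demos` helper, the toy sentences and embeddings, and the rule of taking the first member of each cluster are all illustrative assumptions, not the authors' implementation. Self-verification is not modeled here.

```python
from typing import Dict, List, Sequence, Tuple

# (sentence, pseudo-annotated entity mentions); hypothetical data structure
Demo = Tuple[str, List[str]]
Vector = Sequence[float]


def squared_distance(a: Vector, b: Vector) -> float:
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points: Sequence[Vector], k: int, iters: int = 20) -> List[int]:
    """Plain k-means with deterministic init (the first k points)."""
    if k < 1 or k > len(points):
        raise ValueError("k must be between 1 and the number of points")
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assign every point to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        # Move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(dim) / len(members) for dim in zip(*members)]
    return labels


def select_demos(demos: Sequence[Demo], embeddings: Sequence[Vector], k: int) -> List[Demo]:
    """Pick one demonstration per cluster (assumed rule: first member seen)."""
    labels = kmeans(embeddings, k)
    chosen: Dict[int, Demo] = {}
    for demo, label in zip(demos, labels):
        chosen.setdefault(label, demo)
    return [chosen[c] for c in sorted(chosen)]


# Toy pseudo-annotated pool with made-up 2-D embeddings.
demos = [
    ("Aspirin reduced fever in the cohort.", ["Aspirin"]),
    ("Ibuprofen eased the joint pain.", ["Ibuprofen"]),
    ("BRCA1 mutations were frequent.", ["BRCA1"]),
    ("TP53 was overexpressed.", ["TP53"]),
]
embeddings = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]

selected = select_demos(demos, embeddings, k=2)
```

Taking one demonstration per cluster is only one way to spread the selected examples across the pool; a real pipeline would use sentence embeddings from a model rather than hand-written coordinates.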

Abstract

In-context learning (ICL) enables Large Language Models (LLMs) to perform tasks using few demonstrations, facilitating task adaptation when labeled examples are hard to obtain. However, ICL is sensitive to the choice of demonstrations, and it remains unclear which demonstration attributes enable in-context generalization. In this work, we conduct a perturbation study of in-context demonstrations for low-resource Named Entity Detection (NED). Our surprising finding is that in-context demonstrations with partially correct annotated entity mentions can be as effective for task transfer as fully correct demonstrations. Based on our findings, we propose Pseudo-annotated In-Context Learning (PICLe), a framework for in-context learning with noisy, pseudo-annotated demonstrations. PICLe leverages LLMs to annotate many demonstrations in a zero-shot first pass. We then cluster these synthetic…
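
To make the idea of a partially correct demonstration concrete, the sketch below drops a fraction of the gold entity mentions from a demonstration before formatting it for a prompt. The abstract excerpt does not specify which perturbations the study applies, so mention dropping, the `keep_ratio` parameter, the helper names, and the prompt layout are all assumptions chosen for illustration.

```python
import random
from typing import List


def drop_mentions(mentions: List[str], keep_ratio: float, seed: int = 0) -> List[str]:
    """Keep a random subset of mentions, preserving their original order."""
    if not 0.0 <= keep_ratio <= 1.0:
        raise ValueError("keep_ratio must be in [0, 1]")
    rng = random.Random(seed)
    n_keep = round(len(mentions) * keep_ratio)
    kept = set(rng.sample(range(len(mentions)), n_keep))
    return [m for i, m in enumerate(mentions) if i in kept]


def format_demo(sentence: str, mentions: List[str]) -> str:
    """Render one demonstration in a hypothetical prompt layout."""
    entities = ", ".join(mentions) if mentions else "none"
    return f"Sentence: {sentence}\nEntities: {entities}"


sentence = "Aspirin and ibuprofen were compared with warfarin and heparin."
gold = ["Aspirin", "ibuprofen", "warfarin", "heparin"]

# A demonstration whose annotation covers only half of the gold mentions.
partial = drop_mentions(gold, keep_ratio=0.5)
prompt_block = format_demo(sentence, partial)
```

Comparing model outputs when the prompt uses `gold` versus `partial` is the kind of contrast a perturbation study of this sort would measure.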

Code & Models

Repositories

sMamooler/PICLe (official)

Datasets

smamooler/PICLe
dataset · 8 dl

Videos

PICLe: Pseudo-annotations for In-Context Learning in Low-Resource Named Entity Detection

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Web Data Mining and Analysis

Methods: Sparse Evolutionary Training