Local LLM Ensembles for Zero-shot Portuguese Named Entity Recognition

João Lucas Luz Lima Sarcinelli; Diego Furtado Silva

arXiv:2512.10043 · cs.LG · December 12, 2025

TL;DR

This paper introduces an ensemble approach for zero-shot Portuguese NER with locally run LLMs that outperforms individual models on four of five datasets while requiring only minimal annotated data.

Contribution

The paper proposes a three-step ensemble pipeline for zero-shot NER with local LLMs, introduces a heuristic for selecting model combinations, and demonstrates cross-dataset generalization.

Findings

01

Ensembles outperform individual LLMs on four of five Portuguese NER datasets.

02

Ensembles obtained on different source datasets generally outperform individual LLMs in cross-dataset configurations.

03

The method reduces reliance on annotated data for NER.

Abstract

Large Language Models (LLMs) excel in many Natural Language Processing (NLP) tasks through in-context learning but often under-perform in Named Entity Recognition (NER), especially for lower-resource languages like Portuguese. While open-weight LLMs enable local deployment, no single model dominates all tasks, motivating ensemble approaches. However, existing LLM ensembles focus on text generation or classification, leaving NER under-explored. In this context, this work proposes a novel three-step ensemble pipeline for zero-shot NER using similarly capable, locally run LLMs. Our method outperforms individual LLMs in four out of five Portuguese NER datasets by leveraging a heuristic to select optimal model combinations with minimal annotated data. Moreover, we show that ensembles obtained on different source datasets generally outperform individual LLMs in cross-dataset configurations,…
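The abstract does not spell out the three pipeline steps or the selection heuristic, so the following is only a generic sketch of the two ideas it names: combining entity predictions from several models, and picking a model combination using a small annotated sample. Majority voting over exact (start, end, label) spans and exhaustive F1-based search are assumptions made for illustration, not the authors' method, and the names `vote`, `f1`, and `select_combination` are hypothetical.

```python
from collections import Counter
from itertools import combinations

# An entity is an exact span with a label: (start, end, label).
Entity = tuple[int, int, str]


def vote(predictions: list[set[Entity]], min_votes: int) -> set[Entity]:
    """Keep entities predicted by at least `min_votes` models (assumed rule)."""
    counts = Counter(entity for pred in predictions for entity in pred)
    return {entity for entity, count in counts.items() if count >= min_votes}


def f1(pred: set[Entity], gold: set[Entity]) -> float:
    """Exact-match F1 between predicted and gold entity sets."""
    true_positives = len(pred & gold)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(pred)
    recall = true_positives / len(gold)
    return 2 * precision * recall / (precision + recall)


def select_combination(
    model_preds: dict[str, set[Entity]],
    gold: set[Entity],
    size: int = 3,
) -> tuple[tuple[str, ...], float]:
    """Try every combination of `size` models on a small annotated sample
    and return the one whose majority vote scores the highest F1.
    This exhaustive search is an assumed stand-in for the paper's heuristic."""
    best: tuple[tuple[str, ...], float] | None = None
    for combo in combinations(sorted(model_preds), size):
        merged = vote([model_preds[name] for name in combo], min_votes=size // 2 + 1)
        score = f1(merged, gold)
        if best is None or score > best[1]:
            best = (combo, score)
    if best is None:
        raise ValueError("need at least `size` models to form a combination")
    return best


# Toy example with made-up spans; the model names are placeholders.
preds = {
    "model_a": {(0, 5, "PER"), (10, 15, "LOC")},
    "model_b": {(0, 5, "PER")},
    "model_c": {(0, 5, "PER"), (10, 15, "ORG")},
}
gold = {(0, 5, "PER"), (10, 15, "LOC")}
combo, score = select_combination(preds, gold, size=3)
```

In this toy case only the span all three models agree on survives the vote, which trades recall for precision; a real setup would tune the vote threshold and the combination size on the small annotated sample.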

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Text Readability and Simplification