Local LLM Ensembles for Zero-shot Portuguese Named Entity Recognition
João Lucas Luz Lima Sarcinelli, Diego Furtado Silva

TL;DR
This paper introduces a novel ensemble approach for zero-shot Portuguese NER using local LLMs, outperforming individual models and reducing the need for annotated data.
Contribution
It proposes a three-step ensemble pipeline for zero-shot NER with local LLMs and a novel heuristic for model selection, and it demonstrates cross-dataset generalization.
Findings
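Ensembles outperform individual LLMs on four out of five Portuguese NER datasets.
Ensembles obtained on a different source dataset generally outperform individual LLMs in cross-dataset configurations.
The model-selection heuristic needs only minimal annotated data, reducing reliance on annotation for NER.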
Abstract
Large Language Models (LLMs) excel in many Natural Language Processing (NLP) tasks through in-context learning but often under-perform in Named Entity Recognition (NER), especially for lower-resource languages like Portuguese. While open-weight LLMs enable local deployment, no single model dominates all tasks, motivating ensemble approaches. However, existing LLM ensembles focus on text generation or classification, leaving NER under-explored. In this context, this work proposes a novel three-step ensemble pipeline for zero-shot NER using similarly capable, locally run LLMs. Our method outperforms individual LLMs in four out of five Portuguese NER datasets by leveraging a heuristic to select optimal model combinations with minimal annotated data. Moreover, we show that ensembles obtained on different source datasets generally outperform individual LLMs in cross-dataset configurations,…
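The abstract does not spell out the three pipeline steps or the selection heuristic, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes plain majority voting over (entity text, label) pairs and an exhaustive search over model combinations scored by F1 on a small annotated sample. The model names, the sample entities, and the functions `vote`, `f1`, and `select_ensemble` are all hypothetical.

```python
from collections import Counter
from itertools import combinations


def vote(predictions, min_votes):
    """Keep each (entity_text, label) pair predicted by at least min_votes models.

    Assumption: simple majority voting, not the paper's actual combination rule.
    """
    counts = Counter(pair for preds in predictions for pair in set(preds))
    return {pair for pair, n in counts.items() if n >= min_votes}


def f1(pred, gold):
    """Entity-level F1 between two sets of (entity_text, label) pairs."""
    tp = len(pred & gold)
    if tp == 0:
        return 0.0
    precision = tp / len(pred)
    recall = tp / len(gold)
    return 2 * precision * recall / (precision + recall)


def select_ensemble(model_outputs, gold, size=3):
    """Pick the model combination whose voted output scores best on a small sample.

    Assumption: exhaustive search stands in for the paper's heuristic.
    model_outputs maps a (hypothetical) model name to its predicted pairs.
    """
    best_combo, best_score = None, -1.0
    for combo in combinations(sorted(model_outputs), size):
        merged = vote([model_outputs[m] for m in combo], min_votes=size // 2 + 1)
        score = f1(merged, gold)
        if score > best_score:
            best_combo, best_score = combo, score
    return best_combo, best_score


# Hypothetical predictions on a tiny annotated sample.
gold = {("Lisboa", "LOC"), ("Maria Silva", "PER"), ("Petrobras", "ORG")}
model_outputs = {
    "model_a": {("Lisboa", "LOC"), ("Maria Silva", "PER"), ("Petrobras", "LOC")},
    "model_b": {("Lisboa", "LOC"), ("Petrobras", "ORG")},
    "model_c": {("Maria Silva", "PER"), ("Petrobras", "ORG"), ("Silva", "PER")},
    "model_d": {("Lisboa", "PER")},
}
best_combo, best_score = select_ensemble(model_outputs, gold)
```

In this toy sample each of the three chosen models makes a different mistake, so the vote filters out every error that only one model makes. This mirrors the motivation stated in the abstract that no single model dominates.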
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Text Readability and Simplification
