It's All About the Confidence: An Unsupervised Approach for Multilingual Historical Entity Linking using Large Language Models

Cristian Santini; Marieke Van Erp; Mehwish Alam

arXiv:2601.08500·cs.CL·January 14, 2026

TL;DR

This paper introduces MHEL-LLaMo, an unsupervised multilingual approach for historical entity linking that combines a small language model and a large language model to improve accuracy and efficiency without fine-tuning.

Contribution

The paper presents a novel ensemble method that uses the small language model's confidence scores to decide when to invoke the large language model, reducing costs and hallucinations in historical entity linking.

Findings

01. Outperforms state-of-the-art models on multiple benchmarks

02. Works effectively across six European languages from the 19th and 20th centuries

03. Does not require fine-tuning, enabling scalable low-resource historical EL

Abstract

Despite the recent advancements in NLP with the advent of Large Language Models (LLMs), Entity Linking (EL) for historical texts remains challenging due to linguistic variation, noisy inputs, and evolving semantic conventions. Existing solutions either require substantial training data or rely on domain-specific rules that limit scalability. In this paper, we present MHEL-LLaMo (Multilingual Historical Entity Linking with Large Language MOdels), an unsupervised ensemble approach combining a Small Language Model (SLM) and an LLM. MHEL-LLaMo leverages a multilingual bi-encoder (BELA) for candidate retrieval and an instruction-tuned LLM for NIL prediction and candidate selection via prompt chaining. Our system uses the SLM's confidence scores to discriminate between easy and hard samples, applying the LLM only to hard cases. This strategy reduces computational costs while preventing…
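
The confidence-based routing described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the names `Candidate`, `link_mention`, and `llm_resolve`, the score scale, and the threshold value of 0.8 are all assumptions made for illustration, and the real system's retriever (BELA) and prompt chain are replaced here by plain data and a caller-supplied function.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence, Tuple


@dataclass
class Candidate:
    entity_id: str  # knowledge-base identifier (hypothetical format)
    score: float    # retriever confidence, assumed to lie in [0, 1]


# Hypothetical interface for the LLM stage: given the mention, its context and
# the retrieved candidates, return an entity id, or None to predict NIL.
LLMResolver = Callable[[str, str, Sequence[Candidate]], Optional[str]]


def link_mention(
    mention: str,
    context: str,
    candidates: Sequence[Candidate],
    llm_resolve: LLMResolver,
    threshold: float = 0.8,  # assumed value, not taken from the paper
) -> Tuple[Optional[str], str]:
    """Return (entity_id or None for NIL, name of the stage that decided)."""
    if not candidates:
        # Assumption: with nothing retrieved, the mention is treated as NIL.
        return None, "nil"

    best = max(candidates, key=lambda c: c.score)
    if best.score >= threshold:
        # Easy sample: the small model is confident, so the LLM is skipped.
        return best.entity_id, "slm"

    # Hard sample: defer NIL prediction and candidate selection to the LLM.
    return llm_resolve(mention, context, candidates), "llm"


if __name__ == "__main__":
    # Stand-in for the LLM stage; a real resolver would prompt a model.
    def always_nil(mention: str, context: str, cands: Sequence[Candidate]) -> Optional[str]:
        return None

    easy = [Candidate("E1", 0.95), Candidate("E2", 0.40)]
    hard = [Candidate("E1", 0.55), Candidate("E2", 0.50)]
    print(link_mention("Paris", "example context", easy, always_nil))
    print(link_mention("Paris", "example context", hard, always_nil))
```

Passing the LLM stage as a function keeps the routing logic testable without a model, and makes it easy to count how many mentions actually reach the expensive step for a given threshold.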

Taxonomy

Topics: Topic Modeling · Computational and Text Analysis Methods · Sentiment Analysis and Opinion Mining