Evaluation of LLMs on Long-tail Entity Linking in Historical Documents
Marta Boscariol, Luana Bulla, Lia Draetta, Beatrice Fiumanò, Emanuele Lenzi, Leonardo Piano

TL;DR
This paper evaluates the effectiveness of GPT and LLama3 LLMs in linking rare, long-tail entities in historical texts, demonstrating promising results that suggest LLMs can enhance long-tail entity linking performance.
Contribution
It provides the first systematic assessment of LLMs on long-tail entity linking in historical documents, comparing their performance with a state-of-the-art EL framework.
Findings
LLMs perform well in long-tail EL tasks
LLMs can complement traditional EL methods
Preliminary results show promising potential of LLMs for long-tail entity linking
Abstract
Entity Linking (EL) plays a crucial role in Natural Language Processing (NLP) applications, enabling the disambiguation of entity mentions by linking them to their corresponding entries in a reference knowledge base (KB). Thanks to their deep contextual understanding capabilities, LLMs offer a new perspective to tackle EL, promising better results than traditional methods. Despite the impressive generalization capabilities of LLMs, linking less popular, long-tail entities remains challenging as these entities are often underrepresented in training data and knowledge bases. Furthermore, the long-tail EL task is an understudied problem, and limited studies address it with LLMs. In the present work, we assess the performance of two popular LLMs, GPT and LLama3, in a long-tail entity linking scenario. Using MHERCL v0.1, a manually annotated benchmark of sentences from domain-specific…
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Data Quality and Management
Methods: Byte Pair Encoding · Attention Dropout · Softmax · Residual Connection · Linear Layer · Multi-Head Attention · Dense Connections · Discriminative Fine-Tuning · Adam
