# SynEL: A synthetic benchmark for entity linking

**Authors:** Ilia Karpov, Alexander Kirillovich, Elisaveta Goncharova, Andrey Parinov, Alexander Chernyavskiy, Dmitry Ilvovsky, Natalia Semenova, Artyom Sosedka, Ekaterina Lisitsyna, Mikhail Belkin

PMC · DOI: 10.1371/journal.pone.0339468 · PLOS One · 2026-01-08

## TL;DR

SynEL is a synthetic benchmark for evaluating text-based knowledge extraction methods, entity linking in particular, in low-resource domains. It is validated on customer support dialogues.

## Contribution

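The paper introduces SynEL, a synthetic benchmark for evaluating entity linking in low-resource domains. It also presents a methodology for benchmark construction, proposes two approaches for generating synthetic datasets, and evaluates accumulated hallucinations.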

## Key findings

- Existing LLMs lose up to 25 absolute points of micro-F1 when extracting low-resource entities, compared with high-resource entities from sources like Wikipedia.
- Incorporating synthetic datasets into training improves micro-F1 by up to 10 absolute points.
- The benchmark and generation code are publicly released for fine-tuning and evaluating LLMs.

## Abstract

Large language models (LLMs) offer significant potential for constructing commonsense knowledge graphs from text, demonstrating adaptability across diverse domains. However, their effectiveness varies significantly with domain-specific language, highlighting a critical need for specialized benchmarks to assess and optimize knowledge graph construction sub-tasks like named entity recognition, relation extraction, and entity linking. Currently, domain-specific benchmarks are scarce. To address this gap, we introduce SynEL, a novel benchmark developed for evaluating text-based knowledge extraction methods, validated using customer support dialogues. We present a comprehensive methodology for benchmark construction, propose two distinct approaches for generating synthetic datasets, and evaluate accumulated hallucinations. Our experiments reveal that existing LLMs experience a significant performance drop, with micro-F1 scores decreasing by up to 25 absolute points when extracting low-resource entities compared to high-resource entities from sources like Wikipedia. Furthermore, by incorporating synthetic datasets into the training process, we achieved an improvement in micro-F1 scores of up to 10 absolute points. We publicly release our benchmark and generation code to demonstrate its utility for fine-tuning and evaluating LLMs.
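The results above are reported in micro-F1, which pools true positives, false positives, and false negatives over all documents before computing precision and recall. The sketch below illustrates that metric for entity linking. It is not the authors' evaluation code: the `(start, end, entity_id)` link representation, the exact-match criterion, the function name `micro_f1`, and the example data are all assumptions made for illustration.

```python
from typing import List, Set, Tuple

# Assumed representation of one link: (mention start, mention end, entity id).
# The benchmark's actual data format may differ.
Link = Tuple[int, int, str]


def micro_f1(gold_docs: List[Set[Link]], pred_docs: List[Set[Link]]) -> float:
    """Micro-averaged F1: counts are pooled over all documents first."""
    tp = fp = fn = 0
    for gold, pred in zip(gold_docs, pred_docs):
        tp += len(gold & pred)   # span and entity id both match
        fp += len(pred - gold)   # predicted links not in the gold set
        fn += len(gold - pred)   # gold links the system missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical two-document example (entity ids are made up).
gold = [{(0, 5, "E1"), (10, 15, "E2")}, {(3, 8, "E3")}]
pred = [{(0, 5, "E1"), (10, 15, "E9")}, {(3, 8, "E3"), (20, 25, "E4")}]

score = micro_f1(gold, pred)  # tp=2, fp=2, fn=1, so F1 = 4/7
print(f"micro-F1: {score:.4f}")
```

A difference in "absolute points" is a difference between two such scores on a 0 to 100 scale. For example, a model scoring 0.80 on one entity set and 0.55 on another shows a 25-point drop.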

## Full-text entities

- **Diseases:** hallucinations (MESH:D006212)

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12782364/full.md

## Figures

4 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12782364/full.md

## References

51 references — full list in the complete paper: https://tomesphere.com/paper/PMC12782364/full.md

---
Source: https://tomesphere.com/paper/PMC12782364