OAEI-LLM: A Benchmark Dataset for Understanding Large Language Model Hallucinations in Ontology Matching

Zhangcheng Qiang; Kerry Taylor; Weiqing Wang; Jing Jiang

arXiv:2409.14038·cs.AI·January 30, 2026

Open Access

TL;DR

This paper introduces OAEI-LLM, a benchmark dataset designed to evaluate and understand hallucinations of large language models in ontology matching tasks, addressing a critical need for specialized evaluation tools.

Contribution

The paper presents OAEI-LLM, an extension of the OAEI datasets for assessing LLM hallucinations in ontology matching, and describes its construction methodology and potential use cases.

Findings

01. OAEI-LLM effectively captures LLM hallucinations in ontology matching (OM) tasks.

02. The dataset enables systematic evaluation of LLM reliability in ontology matching.

03. Potential use cases for improving LLM-based ontology matching are demonstrated.

Abstract

Hallucinations of large language models (LLMs) commonly occur in domain-specific downstream tasks, and ontology matching (OM) is no exception. The widespread use of LLMs for OM raises the need for benchmarks to better understand LLM hallucinations. The OAEI-LLM dataset is an extended version of the Ontology Alignment Evaluation Initiative (OAEI) datasets, designed to evaluate LLM-specific hallucinations in OM tasks. We outline the methodology used in dataset construction and schema extension, and provide examples of potential use cases.

Taxonomy

Topics: Semantic Web and Ontologies · Advanced Graph Neural Networks · Complex Network Analysis Techniques

Methods: Ontology