VSPO: Validating Semantic Pitfalls in Ontology via LLM-Based CQ Generation
Hyojun Choi, Seokju Hwang, Kyong-Ho Lee

TL;DR
This paper introduces VSPO, a dataset and model leveraging LLMs to generate competency questions that detect semantic pitfalls in ontologies, improving validation accuracy and reducing manual effort.
Contribution
The study presents the first LLM-based approach that generates competency questions specifically designed to validate semantic pitfalls in ontologies, enhancing detection of modeling errors.
Findings
Model achieves 26% higher precision than GPT-4.1
Model achieves 28.2% higher recall than GPT-4.1
Generates broader range of error-detecting CQs
Abstract
Competency Questions (CQs) play a crucial role in validating ontology design. While manually crafting CQs can be highly time-consuming and costly for ontology engineers, recent studies have explored the use of large language models (LLMs) to automate this process. However, prior approaches have largely evaluated generated CQs based on their similarity to existing datasets, which often fail to verify semantic pitfalls such as "Misusing allValuesFrom". Since such pitfalls cannot be reliably detected through rule-based methods, we propose a novel dataset and model of Validating Semantic Pitfalls in Ontology (VSPO) for CQ generation specifically designed to verify the semantic pitfalls. To simulate missing and misused axioms, we use LLMs to generate natural language definitions of classes and properties and introduce misalignments between the definitions and the ontology by removing axioms…
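The axiom-removal step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the `Axiom` tuple, the `drop_axiom` helper, and the toy pizza ontology are all hypothetical names chosen for illustration, and the LLM-generated definitions are stood in for by hand-written strings. The idea is only that the definition stays fixed while one axiom is removed, so the definition and the ontology no longer agree.

```python
import random
from typing import List, NamedTuple, Tuple


class Axiom(NamedTuple):
    """A toy class restriction (hypothetical structure, not the paper's format)."""
    subject: str       # class the restriction is attached to
    restriction: str   # e.g. "someValuesFrom" or "allValuesFrom"
    prop: str          # object property being restricted
    filler: str        # class used as the restriction's filler


def drop_axiom(axioms: List[Axiom], rng: random.Random) -> Tuple[List[Axiom], Axiom]:
    """Remove one randomly chosen axiom to simulate a missing axiom.

    Returns the perturbed axiom list and the axiom that was removed.
    The input list is left unchanged.
    """
    idx = rng.randrange(len(axioms))
    removed = axioms[idx]
    return axioms[:idx] + axioms[idx + 1:], removed


# Toy ontology plus stand-in definitions (assumed data, for illustration only).
ontology = [
    Axiom("Pizza", "someValuesFrom", "hasTopping", "Topping"),
    Axiom("VegetarianPizza", "allValuesFrom", "hasTopping", "VegetableTopping"),
]
definitions = {
    "Pizza": "A pizza has at least one topping.",
    "VegetarianPizza": "A vegetarian pizza has only vegetable toppings.",
}

perturbed, removed = drop_axiom(ontology, random.Random(0))

# The definition of the affected class is untouched, so it now states
# something the perturbed ontology no longer asserts.
print("Removed axiom:", removed)
print("Definition now misaligned:", definitions[removed.subject])
```

A competency question targeting the removed axiom would then be one that the original ontology can answer but the perturbed one cannot.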
Taxonomy
Topics: Semantic Web and Ontologies · Topic Modeling · Advanced Graph Neural Networks
