OntoURL: A Benchmark for Evaluating Large Language Models on Symbolic Ontological Understanding, Reasoning and Learning
Xiao Zhang, Huiyuan Lai, Qianru Meng, Johan Bos

TL;DR
OntoURL is a comprehensive benchmark designed to evaluate large language models' abilities in understanding, reasoning, and learning with symbolic ontologies across multiple domains, revealing strengths and weaknesses in current models.
Contribution
This paper introduces OntoURL, the first systematic benchmark for assessing LLMs' ontological capabilities, filling a gap in evaluating structured symbolic knowledge processing.
Findings
LLMs understand ontologies well but struggle with reasoning and learning tasks.
Performance varies significantly across models, tasks, and domains.
Prompting strategies influence model performance in ontological tasks.
Abstract
Large language models have demonstrated remarkable capabilities across a wide range of tasks, yet their ability to process structured symbolic knowledge remains underexplored. To address this gap, we propose a taxonomy of ontological capabilities and introduce OntoURL, the first comprehensive benchmark designed to systematically evaluate LLMs' capabilities in handling ontologies -- formal and symbolic representations of domain knowledge. Based on the proposed taxonomy, OntoURL systematically assesses three dimensions: understanding, reasoning, and learning through 15 distinct tasks comprising 57,303 questions derived from 40 ontologies across 8 domains. Experiments with 20 open-source LLMs reveal significant performance differences across models, tasks, and domains, with current LLMs showing capabilities in understanding ontological knowledge but weaknesses in reasoning and learning…
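The abstract reports that performance differs across tasks and domains, which implies scores are broken down per group rather than reported as one number. Below is a minimal sketch of such a breakdown. It assumes each question has a single gold answer label. The record fields (`dimension`, `domain`, `gold`, `pred`), the domain names, and the helper `accuracy_by` are hypothetical and are not taken from OntoURL's released data or code. The toy records are invented for illustration and are not results from the paper.

```python
from collections import defaultdict

# Toy records with a hypothetical schema (not OntoURL's actual format).
# Each record is one benchmark question with a gold label and a model prediction.
records = [
    {"dimension": "understanding", "domain": "domain_a", "gold": "A", "pred": "A"},
    {"dimension": "understanding", "domain": "domain_b", "gold": "B", "pred": "B"},
    {"dimension": "reasoning", "domain": "domain_a", "gold": "C", "pred": "A"},
    {"dimension": "reasoning", "domain": "domain_b", "gold": "D", "pred": "D"},
    {"dimension": "learning", "domain": "domain_a", "gold": "A", "pred": "B"},
    {"dimension": "learning", "domain": "domain_b", "gold": "B", "pred": "C"},
]


def accuracy_by(records, key):
    """Return exact-match accuracy grouped by the given record field."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for record in records:
        group = record[key]
        total[group] += 1
        correct[group] += int(record["pred"] == record["gold"])
    return {group: correct[group] / total[group] for group in total}


if __name__ == "__main__":
    print(accuracy_by(records, "dimension"))
    print(accuracy_by(records, "domain"))
```

Grouping by `dimension` gives one score each for understanding, reasoning, and learning. Grouping by `domain` instead gives the per-domain view over the same records.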
Taxonomy
Topics: Topic Modeling · Advanced Graph Neural Networks · Multimodal Machine Learning Applications
