Schema Generation for Large Knowledge Graphs Using Large Language Models
Bohui Zhang, Yuan He, Lydia Pintscher, Albert Meroño Peñuela, Elena Simperl

TL;DR
This paper explores using large language models to automate schema generation for large knowledge graphs, introducing new datasets and evaluation metrics to assess quality and pushing the limits of LLMs in structured formalism generation.
Contribution
It presents an LLM-based approach to schema creation, introduces two datasets (YAGO Schema and Wikidata EntitySchema) along with new evaluation metrics, and establishes a benchmark for structured schema generation.
Findings
LLMs can generate high-quality ShEx schemas for large KGs.
The proposed pipelines effectively utilize local and global KG information.
Benchmark results reveal the potential and limitations of LLMs in formal schema generation.
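As a rough illustration of what "local" and "global" KG information could look like when passed to an LLM, the sketch below builds a prompt for one class from a handful of toy triples. It is not the authors' pipeline: the triples, the `ex:` identifiers, the function names, and the prompt wording are all hypothetical, and the LLM call itself is left out. Here "local" is taken to mean the triples of a few example instances, and "global" the share of instances of the class that use each predicate.

```python
from collections import Counter

# Toy triples (subject, predicate, object). All identifiers are made up.
TRIPLES = [
    ("ex:alice", "rdf:type", "ex:Person"),
    ("ex:alice", "ex:name", '"Alice"'),
    ("ex:alice", "ex:birthDate", '"1990-01-01"'),
    ("ex:bob", "rdf:type", "ex:Person"),
    ("ex:bob", "ex:name", '"Bob"'),
    ("ex:acme", "rdf:type", "ex:Organization"),
    ("ex:acme", "ex:name", '"Acme"'),
]


def instances_of(triples, cls):
    """Return the sorted subjects typed as `cls`."""
    return sorted(s for s, p, o in triples if p == "rdf:type" and o == cls)


def property_coverage(triples, cls):
    """Global information (assumed definition): the fraction of instances
    of `cls` that use each predicate at least once."""
    members = instances_of(triples, cls)
    if not members:
        return {}
    used = Counter()
    for subject in members:
        predicates = {p for s, p, o in triples if s == subject and p != "rdf:type"}
        used.update(predicates)
    return {p: count / len(members) for p, count in sorted(used.items())}


def build_prompt(triples, cls, n_examples=1):
    """Combine local (example triples) and global (coverage) information
    into a single prompt string asking for a ShEx shape."""
    examples = instances_of(triples, cls)[:n_examples]
    local_lines = [f"{s} {p} {o} ." for s, p, o in triples if s in examples]
    global_lines = [
        f"{p}: used by {share:.0%} of instances"
        for p, share in property_coverage(triples, cls).items()
    ]
    return "\n".join(
        [f"Write a ShEx shape for the class {cls}.", "", "Example triples:"]
        + local_lines
        + ["", "Predicate usage across the class:"]
        + global_lines
    )


if __name__ == "__main__":
    print(build_prompt(TRIPLES, "ex:Person"))
```

The coverage figures give the model a hint about which constraints might be optional: in this toy data every `ex:Person` has a name but only half have a birth date.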
Abstract
Schemas play a vital role in ensuring data quality and supporting usability in the Semantic Web and natural language processing. Traditionally, their creation demands substantial involvement from knowledge engineers and domain experts. Leveraging the impressive capabilities of large language models (LLMs) in tasks like ontology engineering, we explore schema generation using LLMs. To bridge the resource gap, we introduce two datasets: YAGO Schema and Wikidata EntitySchema, along with novel evaluation metrics. The LLM-based pipelines utilize local and global information from knowledge graphs (KGs) to generate schemas in Shape Expressions (ShEx). Experiments demonstrate LLMs' strong potential in producing high-quality ShEx schemas, paving the way for scalable, automated schema generation for large KGs. Furthermore, our benchmark introduces a new challenge for structured generation,…
Taxonomy
Topics: Advanced Graph Neural Networks · Semantic Web and Ontologies · Topic Modeling
Methods: Ontology
