RuDaS: Synthetic Datasets for Rule Learning and Evaluation Tools
Cristina Cornelio, Veronika Thost

TL;DR
This paper introduces RuDaS, a tool that generates synthetic datasets and provides evaluation tools for rule learning systems, addressing the limited diversity of existing datasets and the shortcomings of current evaluation methods.
Contribution
It provides a novel dataset generation tool and evaluation framework for rule learning, filling gaps in existing datasets and assessment approaches.
Findings
Generated diverse datasets covering various rule dependencies
Proposed new performance measures for rule learning evaluation
Facilitated testing of scalability in rule learning systems
Abstract
Logical rules are a popular knowledge representation language in many domains, representing background knowledge and encoding information that can be derived from given facts in a compact form. However, rule formulation is a complex process that requires deep domain expertise, and is further challenged by today's often large, heterogeneous, and incomplete knowledge graphs. Several approaches for learning rules automatically, given a set of input example facts, have been proposed over time, including, more recently, neural systems. Yet, the area is missing adequate datasets and evaluation approaches: existing datasets often resemble toy examples that neither cover the various kinds of dependencies between rules nor allow for testing scalability. We present a tool for generating different kinds of datasets and for evaluating rule learning systems, including new performance measures.
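To make concrete what "information that can be derived from given facts" means, here is a minimal sketch of forward chaining over Datalog-style rules. It is not RuDaS code: the predicates, the example rule, and the helper names `match` and `forward_chain` are all hypothetical, chosen only for illustration. Rule learning is the inverse task: given facts such as those in `derived`, recover a rule like the one listed in `rules`.

```python
# Facts are (predicate, subject, object) triples. In rule atoms,
# uppercase terms are variables (an assumption of this sketch).
facts = {("parent", "ann", "bob"), ("parent", "bob", "cal")}

# Hypothetical rule: grandparent(X, Z) :- parent(X, Y), parent(Y, Z)
rules = [(("grandparent", "X", "Z"),
          [("parent", "X", "Y"), ("parent", "Y", "Z")])]

def match(atom, fact, binding):
    """Extend `binding` so that `atom` equals `fact`, or return None."""
    if atom[0] != fact[0]:
        return None
    new = dict(binding)
    for term, value in zip(atom[1:], fact[1:]):
        if term.isupper():                      # variable
            if new.setdefault(term, value) != value:
                return None                     # conflicting binding
        elif term != value:                     # constant mismatch
            return None
    return new

def forward_chain(facts, rules):
    """Apply the rules repeatedly until no new fact can be derived."""
    derived = set(facts)
    changed = True
    while changed:
        changed = False
        for head, body in rules:
            # Collect every variable binding that satisfies the whole body.
            bindings = [{}]
            for atom in body:
                bindings = [b2 for b in bindings for f in derived
                            if (b2 := match(atom, f, b)) is not None]
            # Instantiate the head once per binding.
            for b in bindings:
                new_fact = (head[0],) + tuple(b.get(t, t) for t in head[1:])
                if new_fact not in derived:
                    derived.add(new_fact)
                    changed = True
    return derived

derived = forward_chain(facts, rules)
```

Here the single rule stands in for every grandparent fact that would otherwise have to be listed explicitly, which is the compactness the abstract refers to.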
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Semantic Web and Ontologies
