Efficiently Learning Probabilistic Logical Models by Cheaply Ranking Mined Rules

Jonathan Feldstein; Dominic Phillips; Efthymia Tsamoura

arXiv:2409.16238·cs.AI·October 7, 2025

TL;DR

This paper presents SPECTRUM, a scalable framework for efficiently learning logical theories from relational data by mining and ranking rules based on a new utility measure, significantly reducing computational costs.

Contribution

The paper introduces a novel utility measure for logical rules and a linear-time algorithm for mining and ranking rules, enabling scalable learning of logical theories from large datasets.

Findings

1. SPECTRUM scales to larger datasets with linear-time rule mining.
2. It learns more accurate logical theories on CPUs in less than 1% of the runtime of neural network approaches.
3. Theoretical guarantees are provided on the utility of the learned theories.

Abstract

Probabilistic logical models are a core component of neurosymbolic AI and are important in their own right for tasks that require high explainability. Unlike neural networks, logical theories that underlie the model are often handcrafted using domain expertise, making their development costly and prone to errors. While there are algorithms that learn logical theories from data, they are generally prohibitively expensive, limiting their applicability in real-world settings. Here, we introduce precision and recall for logical rules and define their composition as rule utility - a cost-effective measure of the predictive power of logical theories. We also introduce SPECTRUM, a scalable framework for learning logical theories from relational data. Its scalability derives from a linear-time algorithm for mining recurrent subgraphs in the data graph along with a second algorithm that, using a…
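To make the idea of precision and recall for a logical rule concrete, here is a minimal sketch on a toy fact set. It is not the paper's definition: the facts, the rule `smokes(X) & friends(X, Y) -> smokes(Y)`, the helper names, and the choice of the product as the "composition" of precision and recall are all assumptions made for illustration only.

```python
# Toy knowledge base (hypothetical facts, not taken from the paper).
facts = {
    ("friends", "anna", "bob"),
    ("friends", "bob", "carl"),
    ("smokes", "anna"),
    ("smokes", "bob"),
    ("smokes", "dora"),
}


def rule_predictions(facts):
    """Head atoms derived by the illustrative rule
    smokes(X) & friends(X, Y) -> smokes(Y)."""
    smokers = {f[1] for f in facts if f[0] == "smokes"}
    return {
        ("smokes", f[2])
        for f in facts
        if f[0] == "friends" and f[1] in smokers
    }


def precision_recall(facts):
    """Precision: share of predicted heads that are observed facts.
    Recall: share of observed head facts that the rule predicts."""
    predicted = rule_predictions(facts)
    observed = {f for f in facts if f[0] == "smokes"}
    correct = predicted & observed
    precision = len(correct) / len(predicted) if predicted else 0.0
    recall = len(correct) / len(observed) if observed else 0.0
    return precision, recall


precision, recall = precision_recall(facts)
# Assumption: the product is used here as one possible composition;
# the paper's actual utility measure may differ.
utility = precision * recall
print(precision, recall, utility)
```

In this toy setting the rule predicts `smokes(bob)` and `smokes(carl)`, of which only the first is observed, so the rule is half precise and covers one of three observed `smokes` facts.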

Peer Reviews

Decision·Submitted to ICLR 2025

Reviewer 01 · Rating 3 · Confidence 4

Strengths

- The problem of scalable rule learning is highly important.
- SPECTRUM does appear to be more efficient and scalable as compared to baselines.
- The authors formally prove a number of properties of SPECTRUM, including completeness and approximate optimality of the pattern mining phase.

Weaknesses

The paper is missing key related work and baselines:

1. Poole et al. (2014) diagnose the scalability problems of MLNs.
2. Gradient boosted Relational Dependency Networks (Natarajan et al., 2012) perform joint structure and parameter learning and have been shown to outperform MLNs.
3. B-RLR (Ramanan et al., 2021) learns Relational Logistic Regression models (Kazemi et al., 2014) using Functional Gradient boosting.
4. NNRPT (Kaur et al., 2020) uses relational random walks to instantiate neural ne

Reviewer 02 · Rating 6 · Confidence 2

Strengths

1. The structure of the paper is easy to understand.
2. The authors propose well-designed metrics to evaluate symbolic rules.
3. The proposed ILP algorithms are analyzed in terms of complexity and completeness. In addition, the experiments are conducted explicitly.

Weaknesses

1. Some definitions are not clearly presented; see Question 1.
2. Regarding the scalability of the proposed ILP model, there are no results indicating that SPECTRUM can extract rules from very large knowledge graphs such as UMLS, Kinship, FB15K-237, etc. However, NeuralLP (Yang et al., 2017) and DRUM (Sadeghian et al., 2019) did learn symbolic rules from these large datasets.

Reviewer 03 · Rating 6 · Confidence 3

Strengths

1. The paper has a well-defined motivation for tackling scalability issues in probabilistic logic models, with concise and clear explanations.
2. The authors provide theoretical guarantees on the computational cost required for a certain error bound on the utility estimates.
3. Experimental results show that SPECTRUM significantly outperforms previous methods in both efficiency and predictive accuracy, highlighting its practical advantages.

Weaknesses

1. The SPECTRUM framework itself does not learn probabilistic rules, and the emphasis on "probabilistic" throughout, including in the title, seems somewhat overstated.
2. The authors' introduction of Inductive Logic Programming and Differentiable Rule Learning in the related work section appears insufficient. Some advanced differentiable rule learning algorithms have less reliance on templates and offer better scalability compared to traditional methods [1]; a comparison with these should be inc

Taxonomy

Topics: Bayesian Modeling and Causal Inference · Semantic Web and Ontologies · Data Mining Algorithms and Applications