Efficient predicate invention using shared "NeMuS"
Edjard Mota, Jacob M. Howe, Ana Schramm, Artur d'Avila Garcez

TL;DR
This paper introduces Amao, a cognitive agent framework that employs a Neural Multi-Space graph structure to efficiently invent predicates and support inductive learning, including recursive hypotheses, within a constrained logical language.
Contribution
It presents a novel predicate invention method using shared NeMuS graphs that enhances inductive logic programming with neural weights and supports recursive hypothesis learning.
Findings
Supports recursive hypothesis learning.
Uses neural weights to guide predicate invention.
Restricts hypothesis shape with language biases.
Abstract
Amao is a cognitive agent framework that tackles the invention of predicates with a different strategy from recent advances in Inductive Logic Programming (ILP) such as the Meta-Interpretive Learning (MIL) technique. It uses a Neural Multi-Space (NeMuS) graph structure to anti-unify atoms from the Herbrand base that pass the inductive momentum check. Inductive Clause Learning (ICL), as it is called, is extended here by using the weights of logical components, already present in NeMuS, to support inductive learning by expanding clause candidates with anti-unified atoms. An efficient invention mechanism is achieved, including the learning of recursive hypotheses, while restricting the shape of the hypothesis by adding bias definitions or idiosyncrasies of the language.
[Table: induction steps and the partial hypothesis at each step, each marked consistent or inconsistent.]
[Table: atom/hypothesis expansion, with rows for the no-hook case, bias matching both predicates with variable renaming, and the case where the body reaches its maximum size (no further literal is added; check for region similarity).]
Efficient Predicate Invention using Shared NeMuS
Edjard Mota1
Jacob M. Howe2
Ana Schramm1
Artur d’Avila Garcez2
1Institute of Computing, Federal University of Amazonas, Manaus - Brazil
2Department of Computer Science, City, University of London, UK
{edjard, acms}@icomp.ufam.edu.br, {j.m.howe, A.GARCEZ}@city.ac.uk
1 Introduction
One of the key challenges in Inductive Logic Programming (ILP) is finding good heuristics to search the hypothesis space. In standard ILP, a good heuristic is one that can arrive quickly at a hypothesis that is both successful and succinct. To achieve this, the efficiency of hypothesis generation depends on the partial or even total order over the Herbrand Base to constrain deduction operations. This work presents a new approach called Inductive Clause Learning (ICL), building on Mota and Diniz (2016), which introduced a data structure named Neural Multi-Space (NeMuS) used in Amao, a neural-symbolic reasoning platform that performs symbolic reasoning on structured clauses via Linear Resolution (Robinson, 1965).
Inspired by Boyer and Moore (1972), NeMuS is a shared multi-space representation for a portion of first-order logic designed for use with machine learning and neural network methods. Such a structure contains weightings on individual elements (atoms, predicates, functions and constants or variables) to help guide the use of these elements in tasks such as theorem proving, as well as to guide the search in the hypothesis space and improve the efficiency and success of inductive learning. Although ICL has some similarities to ILP, its hypothesis search mechanism is fundamentally different. It uses the Herbrand Base (HB) to build up hypothesis candidates using inverse unification (adapted from Idestam-Almquist (1993)), and prunes away meaningless hypotheses as a result of inductive momentum between predicates connected to positive and negative examples. In Mota et al. (2017), inverse unification, or anti-unification (Idestam-Almquist, 1993), was added to allow induction of general rules from ground clauses, which is supported by the idea of regions of concepts. However, the inductive learning algorithm presented there did not consider an adequate representation and use of bias, nor the invention of predicates (predicate invention). Here we show how this can be achieved without using a meta-interpreter level of bias specification or reasoning. It is important to note that weights are still not automatically used, but are taken into account when apparently unconnected literals have a common predicate name.
This paper makes the following contributions: it demonstrates that it is possible to have predicate invention without the use of meta-rules, and consequently, of Meta-Interpretive Learning. It shows that the NeMuS structure can be used for this purpose without generating numerous meaningless hypotheses, as the invention is made during Inductive Clause Learning. For that, we use bias or automated predicate generation. Finally, it demonstrates how invention of recursive rules takes advantage of weights of the logical component representation within NeMuS.
The remainder of this paper is structured as follows: section 2 gives some brief background on inductive logic programming and the Shared NeMuS data structure, sections 3 and 4 describe the implementation of inductive learning in Amao using the Shared NeMuS data structure, then section 5 describes some related work and section 6 discusses the work presented.
2 Background
2.1 Inductive Logic Programming (ILP)
The main challenge of Inductive Logic Programming (ILP), as defined in Muggleton (1991), is to search for a logical description (a hypothesis, $H$) of a target concept, based on a set of (positive and negative) examples along with a set called background knowledge ($BK$). The central idea is that $H$ is a consistent hypothesis, i.e. $BK$ plus the hypothesis $H$ entails the positive examples ($e^+$), whilst it does not entail the negative ones ($e^-$). Formally, $BK \cup H \models e^+$ and $BK \cup H \not\models e^-$.
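The consistency condition can be made concrete with a small check. The following is only a sketch of the standard generate-and-test reading of the definition, not of how Amao works: it assumes function-free definite clauses, atoms written as tuples of strings, variables starting with an uppercase letter, and a toy background knowledge whose facts are invented for illustration.

```python
def is_var(term):
    # Assumed convention: variables start with an uppercase letter.
    return term[:1].isupper()


def match(pattern, fact, theta):
    """Extend substitution theta so that pattern equals the ground fact, else None."""
    if pattern[0] != fact[0] or len(pattern) != len(fact):
        return None
    theta = dict(theta)
    for p, f in zip(pattern[1:], fact[1:]):
        if is_var(p):
            if theta.setdefault(p, f) != f:
                return None
        elif p != f:
            return None
    return theta


def least_model(facts, clauses):
    """Naive forward chaining to a fixpoint.

    Assumes every head variable also occurs in the body (range restriction).
    """
    model = set(facts)
    changed = True
    while changed:
        changed = False
        for head, body in clauses:
            thetas = [{}]
            for literal in body:
                thetas = [t2 for t in thetas for fact in model
                          if (t2 := match(literal, fact, t)) is not None]
            for t in thetas:
                new_atom = (head[0],) + tuple(t.get(a, a) for a in head[1:])
                if new_atom not in model:
                    model.add(new_atom)
                    changed = True
    return model


def consistent(bk, hypothesis, positives, negatives):
    """True when BK plus H entails every positive and no negative example."""
    model = least_model(bk, hypothesis)
    return all(e in model for e in positives) and not any(e in model for e in negatives)


# Hypothetical toy data, not taken from the paper.
BK = {("parent", "ann", "bob"), ("parent", "bob", "cal")}
H = [(("grandparent", "X", "Z"), [("parent", "X", "Y"), ("parent", "Y", "Z")])]
POS = [("grandparent", "ann", "cal")]
NEG = [("grandparent", "bob", "ann")]

print(consistent(BK, H, POS, NEG))
```

A system built this way has to produce a complete hypothesis before it can test it, which is the cost that ICL tries to avoid by checking candidates during construction.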
Typical ILP systems implement search strategies over the space of all possible hypotheses. To reduce search complexity, such mechanisms rely on the partial order of $\theta$-subsumption (Nienhuys-Cheng and De Wolf, 1997), or on a total ordering over the Herbrand Base to constrain deductive, abductive and inductive operations (Muggleton et al., 2015). As a side-effect, the space of hypotheses grows even more due to quantification over meta-rules by the meta-interpretive process.
The Inductive Clause Learning (ICL) technique is fundamentally different, although its learning results are similar to those of ILP and MIL. It does not generate hypotheses and then test whether they entail the positive examples but not the negative ones. Instead, ICL anticipates the elimination of inconsistent hypotheses at each induction step by colliding atoms obtained from a search across bindings of constants from $e^+$ and $e^-$. This is possible because NeMuS is a network of shared spaces (for constant terms, predicates and clauses), interconnected through weighted bindings pointing to the target space in which an occurrence of an element appears. This structure is briefly described below.
2.2 Shared NeMuS
NeMuS is an ordered space for components of a first-order language: variables (space 0), atomic constants of the Herbrand Universe (space 1), functions (space 2, suppressed here), predicates with literal instances (space 3), and clauses (space 4), and so on. In what follows vectors are written , and or is used to refer to an element of a vector at position .
Each logical element is described by a vector called T-Node, and in particular each element is uniquely identified by an integer code (an index) within its space. In addition, a T-Node identifies the lexicographic occurrence of the element, and (when appropriate) an attribute position.
Definition 1 (T-Node)
Let , and . A T-Node (target node) is a quadruple that identifies an object at space , with code and occurrence , at attribute position (when it applies, otherwise 1). is the set of all T-Nodes. For a vector , of T-Nodes, with size , and is a code occurring in an element of , then is the index of within the T-Node, .
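As a rough illustration of Definition 1, a T-Node can be written as a four-field record. This is a minimal sketch: the field names, their order within the quadruple, and the helper `index_of_code` are assumptions made for illustration, since the definition's own symbols are not reproduced here.

```python
from typing import NamedTuple, Sequence


class TNode(NamedTuple):
    # Field names and their order are assumed for this sketch.
    space: int         # space the object lives in (0 variables, 1 constants, 3 predicates, 4 clauses)
    code: int          # integer index that identifies the object within its space
    occurrence: int    # lexicographic occurrence of the object
    position: int = 1  # attribute position; 1 when it does not apply


def index_of_code(nodes: Sequence[TNode], code: int) -> int:
    """Return the index in `nodes` of the first T-Node carrying `code`, or -1."""
    for k, node in enumerate(nodes):
        if node.code == code:
            return k
    return -1


# A hypothetical argument vector made of two constants.
args = [TNode(space=1, code=7, occurrence=1, position=1),
        TNode(space=1, code=3, occurrence=2, position=2)]
print(index_of_code(args, 3))
```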
As T-Nodes are the building blocks of our approach, all other elements follow from them and are described below.
NeMuS Binding
is an indexed pair in which , and , such that , , and . It represents the importance of object over occurrence of object at space in position .
Variable Space (0)
is a vector , in which each is a vector of bindings. The elements of the variable space represent all of the occurrences of a variable. The logical scope of a variable is identified by the instances of its bindings.
Constant Space (1)
is a vector , in which every is a vector of bindings. The function maps a constant to the vector of its bindings , as above.
Compounds (functions, predicates and clauses), are in higher spaces. Their logical components are formed by a vector of T-Nodes (one for each argument), and a vector of NeMuS bindings (simply bindings) to represent their instances.
Compound
in NeMuS is a vector of T-Nodes, i.e. , so that each , and it represents an attribute of a compound logical expression coded as .
Instance Space
(I-Space) of a compound is the pair in which is a vector of bindings. A vector of I-Spaces is a NeMuS Compound Space (C-Space).
A literal (predicate instance), is an element of an I-Space, and so the predicate space is simply a C-Space. Seen as compounds, clauses’ attributes are the literals composing such clauses.
Predicate Space (3)
is a pair in which and are vectors of C-spaces.
Clause Space (4)
is a vector of C-spaces such that every pair in the vector shall be .
Note that the order of each space is only defined when they are gathered in the following structure.
Definition 2 (Shared NeMuS)
A Shared NeMuS for a set of coded first-order expressions is a 4-tuple (assuming no functions), , in which is the variable space, is the constant space, is the predicate space and is the clause space.
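The spaces gathered by Definition 2 can be pictured as nested containers. The sketch below is one possible layout under stated assumptions: a binding is taken to be a target T-Node plus a numeric weight, the two components of the predicate space are left unnamed because their roles are not spelled out in the text above, and all class and field names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

# (space, code, occurrence, position); the field order is an assumption.
TNode = Tuple[int, int, int, int]


@dataclass
class Binding:
    target: TNode        # where the occurrence appears
    weight: float = 0.0  # importance of the object over that occurrence (assumed numeric)


@dataclass
class ISpace:
    attributes: List[TNode]                            # one T-Node per argument
    bindings: List[Binding] = field(default_factory=list)


CSpace = List[ISpace]  # a compound space is a vector of instance spaces


@dataclass
class SharedNeMuS:
    variables: List[List[Binding]] = field(default_factory=list)  # space 0
    constants: List[List[Binding]] = field(default_factory=list)  # space 1
    predicates: Tuple[List[CSpace], List[CSpace]] = field(
        default_factory=lambda: ([], []))                          # space 3, a pair
    clauses: List[CSpace] = field(default_factory=list)            # space 4

    def bindings_of_constant(self, code: int) -> List[Binding]:
        """Map a constant code to the vector of its bindings."""
        return self.constants[code]


# Hypothetical content: constant 0 occurs as first argument of predicate 2.
nemus = SharedNeMuS()
nemus.constants.append([Binding(target=(3, 2, 1, 1), weight=0.5)])
print(nemus.bindings_of_constant(0))
```

Keeping the bindings on the constant side is what makes it cheap to ask "where does this constant occur?", which is the query the induction procedure relies on.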
The next section describes how inductive learning is performed in Amao using the NeMuS structure.
3 NeMuS-based Inductive Learning
ICL is based on the concept of the Least Herbrand Model (LHM) (Lloyd, 1993) to anticipate the elimination of inconsistent hypotheses, at each induction step, before they are fully generated. This is done by colliding, via computing the inductive momentum, atoms obtained from bindings of arguments from $e^+$ (candidates to compose the LHM) and $e^-$. Then, a pattern of linkage across verified literals is identified, anti-unification (adapted from Idestam-Almquist (1993)) is applied, and a conjecture is generated. The process repeats until the conjecture becomes a closed and consistent hypothesis.
3.1 Inductive Learning from the Herbrand Base
The problem of inductive learning involves a knowledge base of predicates, called background knowledge (), a set of examples that the logical description of the target concept () should prove (positive examples, ) and a set of examples that the target concept should not prove (negative examples, ). Figure 1 depicts a possible Herbrand Base that may explain how entails a positive example for the concept . In its turn, can be unary (), binary (), etc.
From attribute bindings of , i.e. and possibly , is connected with attribute mates’ bindings. Such connections bridge all concepts they may appear in (, , etc.), like a path in a graph, until it reaches the binding concept of ’s last attribute. The interconnected concepts form a linkage pattern. For example, when , etc., are all the same concept, then a recursive hypothesis may be generated. Invention and hypothesis generation will always take place in the bridging concepts’ region (including the initial and last binding concepts).
The induction method is based on the following aspects: (a) inductive momentum, which iteratively “selects” only those atoms not likely to entail $e^-$; (b) linkage patterns among the atoms that passed (a), based on internal connections (bridging concepts); (c) anti-unification, which substitutes constants in an atom from the Herbrand Base by variables; and, optionally, (d) category-first ordering (Schramm et al., 2017), useful when $BK$ contains monadic definitions of categories.
3.2 Linkage Patterns and Hypothesis
A special form of intersection identifies the common terms between both literals. For instance, in Figure 1, suppose , i.e. bridging concepts has just two literals of the same concept : and . Then .
Definition 3 (Linkage and Hook-Terms)
Let and be two predicates of a . There is a Linkage between and if the same constant, , appears (at least) once in ground instances of and . We call a hook-term, computed by . The attribute mates w.r.t. an atom , written is a set of terms occurring in , but not in .
With as above, is their hook term and and . This would form the definite clause , which is not generated but it is built using anti-unification to generalize over its hooked ground literals. In the following definition we use the standard notion of a substitution as a set of pairs of variables and terms like .
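Hook terms and attribute mates can be sketched directly on ground atoms. This is an illustrative reading of Definition 3 over plain tuples rather than over the NeMuS spaces; the function names and the sample atoms are hypothetical, and attribute mates are taken to be the arguments of an atom that are not hook terms.

```python
def hook_terms(p, q):
    """Constants shared by two ground atoms, in order of appearance in p."""
    return [t for t in dict.fromkeys(p[1:]) if t in q[1:]]


def attribute_mates(atom, hooks):
    """Arguments of `atom` that are not hook terms (assumed reading of the definition)."""
    return [t for t in atom[1:] if t not in hooks]


# Hypothetical ground atoms sharing the constant "b".
p = ("p", "a", "b")
q = ("q", "b", "c")

hooks = hook_terms(p, q)
print(hooks, attribute_mates(p, hooks), attribute_mates(q, hooks))
```

With these two atoms the hook is the shared constant, and the remaining arguments are the ends of the chain that a head literal would have to connect.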
Definition 4 (Anti-substitution and Anti-unification)
Let be a first-order expression with no constant term and be free variables of , is a ground first-order expression, and are constant terms of . An anti-substitution is a set such that , and is called a simple anti-unification of . The anti-unification maps a ground atom to its corresponding anti-substitution set that generalizes , i.e.
Note that in the original definition of anti-unification, Idestam-Almquist (1993), should be the generalisation of two ground expressions. Here, as we build a hypothesis by adding literals from a definite clause the definition is
Definition 5 (Anti-unification on linkage terms)
Given two literals and . A linkage term, say , for their hook term , is a variable that can be placed, by anti-unification, in the hook’s position wherever it appears in the ground instance that will produce two non-ground literals.
The definite clause , from Figure 1 (), is incrementally anti-unified, and so the following general clause is found: .
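Incremental anti-unification can be sketched as a constant-to-variable map that is shared across the literals of a clause, so that a hook constant always receives the same linkage variable. The code below is a simplified illustration on tuples; the variable naming scheme (`X0`, `X1`, ...) and the sample clause are assumptions.

```python
from itertools import count


def anti_unify(atoms, theta=None):
    """Replace constants by variables, reusing one variable per constant.

    `theta` is the anti-substitution built so far (constant -> variable); it is
    extended in place and returned so that literals can be added one at a time.
    Assumes any pre-existing entries in `theta` were created by this function.
    """
    theta = {} if theta is None else theta
    fresh = (f"X{i}" for i in count(len(theta)))
    generalised = []
    for pred, *args in atoms:
        new_args = []
        for constant in args:
            if constant not in theta:
                theta[constant] = next(fresh)
            new_args.append(theta[constant])
        generalised.append((pred, *new_args))
    return generalised, theta


# Hypothetical ground clause r(a, c) <- p(a, b), q(b, c), head first.
ground = [("r", "a", "c"), ("p", "a", "b"), ("q", "b", "c")]
clause, theta = anti_unify(ground)
print(clause)
```

Because the map is threaded through successive calls, a body literal added at a later induction step is generalised consistently with the head and the literals already in place.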
This concept is the fundamental operation to generate a hypothesis because it generalizes ground formulas into universally quantified ones. Before describing how negative examples are used we have the following definition.
Definition 6 (Hypothesis)
Let be a formula with no constant term, be a set of ground atoms formed by concepts and constants from a Herbrand universe , is a ground atom with terms (which belongs to the base of constants, ) such that . We say that is a hypothesis for with respect to if and only if there is a set of atoms and a such that for
1. every of , there is some and
2. every and is not empty.
3. when 1 and 2 hold, then .
An open hypothesis is one in which at least one term of has not been anti-unified. Thus, 1, 2 and 3 will always generate a closed least hypothesis.
Recall that learning should involve the generation of a hypothesis and to test it against positive and negative examples. Such a “test” could be done while the hypothesis is being generated. This is the fundamental role of the following concept. Note that, in the interest of saving space, the following sections shall use logic notation. It is important, however, to recall the definitions given in section 2.2, as the method described runs on top of the Shared NeMuS structure.
3.3 Inductive Momentum
In the following definition, is a constant originating from the path of a positive example, and is another constant originating from a negative example.
Definition 7 (Inductive Momentum)
Let and be two I-spaces of atoms and , representing and atomic formulas (literals) in the Herbrand base. If and , from and , such that and , i.e. is an element of and is an element of , then the inductive momentum between and with respect to and is
[TABLE]
Note that if , then it is assumed since they are the same code in the predicate space. When it is clear in the context we shall simply write rather than the T-Node vector notation.
Example 1. is formed by ground instances of binary and monadic predicates (not limited to them), atoms and Herbrand universe as follows.
, 2. 2.
, 3. 3.
, 4. 4.
, 5. 5.
target , with : and : .
When the BK is compiled its correspondent NeMuS structure is also built. The induction mechanism, at each step, adds to the premise of a hypothesis the next available atom from the bindings of a constant only if such an atom “resists” the inductive momentum.
For a partial hypothesis would be with , but the equivalent path from negative example would reach . This would allow to be also deduced, which is not what is expected from a sound hypothesis. Thus, this hypothesis is dropped. For this example, a sound hypothesis could be .
Had the target concept been and positive example , then a possible hypothesis generated would be
.
3.4 Predicate Invention
Predicate invention, according to the ILP definition, is a bias defined by the user via a declarative language. It is a way to deal with predicates missing from the for lack of information. Suppose that the target concept of Figure 1 is , BK is the set . There are two different concept relations in which the constant participates as a second attribute. There can be many instances of both and , and no constant appears as first argument of both. There seems to be a new concept that captures the property that all persons share when appearing as the first attribute of either relation.
On a closer look at Figure 2, it is possible that our general approach to predicate invention generates hypotheses that do not look like what we expect. For example given , might generate .
It is assumed that two concepts, say and , are “specialisations” of another concept whenever there are objects appearing as second argument of both, but can only appear as first in one of them. This is communicated to Amao as follows
Consider induction on T knowing E assuming P1 or P2 defines NewP,
to mean that the bias we are looking for is NewP ← P1 or NewP ← P2.
We say that an invented predicate bridges two regions of concepts, thereby allowing a simpler generalisation of ground rules into hypotheses. This is illustrated in Figure 2.
Every time either or both concepts are involved in a hypothesis generation, the new concept is used to intentionally define the target predicate. So, from the figure above the rule base would be
Of course the new concept is parent and it is not a target concept to consider induction, but shall be used as a bridge or as a base form of a hypothesis, while the target shall be a linear or recursive linkage pattern (in our approach), or tail recursive.
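The effect of a "P1 or P2 defines NewP" bias can be illustrated by expanding it into one bridging clause per specialising predicate and lifting the matching ground facts to the invented predicate. This is a sketch for dyadic predicates only; the function names and the sample facts are hypothetical.

```python
def expand_bias(specialisations, invented):
    """One clause invented(X, Y) <- p(X, Y) per specialising predicate p."""
    return [((invented, "X", "Y"), [(p, "X", "Y")]) for p in specialisations]


def invented_facts(facts, specialisations, invented):
    """Ground facts of the invented predicate implied by the bridging clauses."""
    return {(invented, *args) for pred, *args in facts if pred in specialisations}


# Hypothetical background facts.
facts = {("father", "jake", "alice"), ("mother", "alice", "bob"), ("likes", "bob", "tea")}

print(expand_bias(["father", "mother"], "parent"))
print(sorted(invented_facts(facts, ["father", "mother"], "parent")))
```

After this step the two original regions can be walked as a single region of the invented concept, which is what allows a chain through either relation to be generalised by one clause.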
4 Inductive Clause Learning with Invention
The method presented in this section combines all the ideas described in section 3. We shall use standard logic program notation for clauses just for readability's sake, but recall that the Amao language treats as . The general idea of ICL can be summarised in three main steps.
1. to walk across the linkages found in the Herbrand Base in order to select atoms as candidates for composing hypotheses, as well as those to oppose the compositions
2. to compute of atoms as candidates for anti-unification that were selected from positive and negative linkages.
3. to generalize, via anti-unification, only atomic formulas likely to build consistent hypotheses, i.e. those composed by atoms consistent with respect to
In the following description we shall consider a dyadic theory with no function terms.
4.1 Selecting Candidates to Compose Hypothesis
Given the target , : and : . We access, from the NeMuS of the BK, and . The initial view of the space of possible hypotheses that can be formed using atoms from the Herbrand Base and anti-unification is illustrated in Figure 3.
Each in a triangle represents a hypothesis formation branch that can be expanded following the bindings of the attribute in . Some of them may allow the deduction of , and thus inductive momentum is applied to validate fetched atoms. After adding an anti-unified literal from into the premise of the hypothesis being generated, say , the next induction step will take a branch from the attribute-mates of to compute , generalize and so on. This is a depth-first walk across the Herbrand Base. In the breadth-first walk the generation of is postponed until all triangle branches have been initially explored. For the sake of completeness, the walk is implemented breadth-first.
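The breadth-first walk can be pictured as a search over an index from constants to the atoms they occur in. The sketch below only enumerates chains of ground atoms that link the first argument of a positive example to its last argument; it deliberately omits the inductive momentum check and anti-unification, works on tuples instead of NeMuS bindings, and uses hypothetical names, facts and a hypothetical `max_len` bound.

```python
from collections import deque


def linkage_paths(facts, start, goal, max_len=3):
    """Breadth-first enumeration of atom chains linking `start` to `goal`."""
    by_const = {}
    for atom in facts:
        for c in atom[1:]:
            by_const.setdefault(c, []).append(atom)

    paths, queue = [], deque([(start, [])])
    while queue:
        const, path = queue.popleft()
        if len(path) == max_len:
            continue  # body size limit reached, do not expand further
        for atom in by_const.get(const, []):
            if atom in path:
                continue  # never reuse an atom within one chain
            new_path = path + [atom]
            for mate in (t for t in atom[1:] if t != const):
                if mate == goal:
                    paths.append(new_path)
                else:
                    queue.append((mate, new_path))
    return paths


# Hypothetical Herbrand base; the positive example would be r(a, c).
facts = [("p", "a", "b"), ("q", "b", "c"), ("p", "a", "d")]
print(linkage_paths(facts, "a", "c"))
```

Because the queue is processed level by level, shorter chains are reported before longer ones, which matches the preference for succinct hypotheses.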
4.2 Computing and Linkages
Accessing the bindings of constants is straightforward: we keep a loop selecting the instances of the literals that appear until the last is verified. Basically it is running while computing and moving across links of the sub-trees (triangles) from Figure 3.
If consistent for all , then
- If and , then
: ,
- Else if and , then : ,
- Otherwise, get another and repeat the process until there are no more elements to test. In this case there is no hypothesis.
For a consistent , then there may exist , and
- a)
, is an atom from the Herbrand Base then for
: . (Chain in ILP)
- b)
: path can only form a long linear linkage pattern. For non dyadic, if is ok then expand hypotheses : . ’s body is added with (see expansion illustrated in Figure 4).
4.3 “Bias” as Invention of Predicates
Amao performs a similarity training on NeMuS’s weights using the vector representation for each constant as well as for literals. Those with similar linkages end up with similar weight values associated to the argument they have and their position within them. Besides, bias may be used to add non targeted new predicates.
Non user bias: “automated” invention
For this, it is necessary “to invent” a predicate, say , such that becomes a closed hypothesis. For the sake of space will be suppressed when anti-substitutions are clear.
- : , The invented predicate becomes the head of an “invented hypothesis”, as
- : with , and it becomes the current open hypothesis. The search now is guided by .
User defined bias for invention
When an assumption that defines another concept, say , then ’s body would have and ’s head would have , rather than . This would be something like assuming defines , then
-
: ,
-
:
-
•
Assuming , i.e. both are the same predicate (concept region).
1. If , and no bias given.
Simple linear linkage pattern (chain)
: ,
“Shallow” recursive linkage pattern (recursive tail)
:
:
The order in which they are introduced into the set of clauses is unimportant.
2. If
- (a)
For bias and non dyadic theory: long linear linkage pattern of the same concept would generate
: .
Instead, if and region’s weights are similar, then a recursive hypothesis is invented.
: ,
: ,
As there can be many bindings, we close an open hypothesis for each possible combination of bindings. Then, we keep computing the momentum and expanding a new branch for each combination (as explained in sections 4.1 to 4.3).
4.4 A Running Example: the Family Tree
Example 2. Consider the Family Tree from Muggleton et al. (2015). We may make the following request to Amao:
consider induction on ancestor(X,Y) knowing ancestor(jake,bob) assuming father(X,Y) or mother(X,Y) defines parent(X,Y).
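To make the shape of such a request explicit, the sketch below splits it into its target, examples and bias parts. The pattern is inferred only from the two request forms quoted in this paper and is not Amao's actual grammar or parser; the function name and the returned keys are hypothetical.

```python
import re

# Pattern inferred from the quoted requests; the bias part is treated as optional.
REQUEST = re.compile(
    r"consider induction on (?P<target>.+?) knowing (?P<examples>.+?)"
    r"(?: assuming (?P<specs>.+?) defines (?P<invented>.+?))?\.?\s*$",
    re.IGNORECASE,
)


def parse_request(text):
    """Split an induction request into target, examples and optional bias."""
    m = REQUEST.match(text.strip())
    if m is None:
        raise ValueError("not an induction request")
    specs = m.group("specs")
    return {
        "target": m.group("target"),
        "examples": m.group("examples"),
        "specialisations": specs.split(" or ") if specs else [],
        "invented": m.group("invented"),
    }


request = ("consider induction on ancestor(X,Y) knowing ancestor(jake,bob) "
           "assuming father(X,Y) or mother(X,Y) defines parent(X,Y).")
print(parse_request(request))
```

Here the target is `ancestor(X,Y)`, the positive example is `ancestor(jake,bob)`, and the bias states that `father` and `mother` are specialisations of the invented concept `parent`.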
5 Related Work
Recent advances in Inductive Logic Programming (ILP) ease predicate invention by constraining logical learning operations with higher-order meta-rules, expressions that describe the formats of the rules. These rules have order constraints associated with them (to ensure termination of the proof) and are provided to the meta-interpreter, which attempts to prove the examples. When successful at this task, it then saves the substitutions for existentially quantified variables in the meta-rules (Muggleton, 2017). This technique has been used to build Metagol (Muggleton et al., 2014, 2015), which has been successful in various examples. However, this approach tends to increase the generation of meaningless hypotheses and, consequently, leads to a large hypothesis space. Cropper and Muggleton (2016) tackle this challenge by extending Metagol to support abstractions and invention, but it remains a problem. Amao takes a totally different approach by using NeMuS to perform Inductive Clause Learning (ICL). This work extends ICL by using the results of exploring weights of logical components, already present in NeMuS, to support inductive learning by expanding clause candidates with literals that passed the inductive momentum check. This allows an efficient invention of predicates, including the learning of recursive hypotheses, while restricting the shape of the hypothesis by adding bias definitions or idiosyncrasies of the language.
6 Discussion and Future Work
This paper has shown how the Amao Shared NeMuS data structure can be used in predicate invention without the need to generate meaningless hypotheses. This is achieved via Inductive Clause Learning, with automatic predicate generation that takes advantage of the degree of importance of constant objects. As atomic objects, constants of the Herbrand base have never attracted much attention in logical inference, serving only to validate resolution through unification. Here, we showed how they can be used to guide the search for consistent hypotheses in two ways.
First, by walking across their bindings from positive examples which are not rejected by inductive momentum with bindings of negative examples. Second, we use, as a heuristic for faster generation of potentially recursive hypotheses, the maps or regions of similarities (item 2 of bias for invention) that constant bindings allow us to compute. The use of such maps as an inductive mechanism is demonstrated in Barreto and Mota (2019).
Future work will focus on making more efficient use of weighted structures of concepts and their composition to allow learning and reasoning of complex formulae, as well as dealing with noise, uncertainty, and possible worlds. We then aim to incorporate deep learning-like mechanisms by taking advantage of the inherently interconnected compound spaces as a sort of layers for convolution when dealing with massive datasets.
References
- Barreto and Mota [2019] Leonardo Barreto and E. de Souza Mota. Self-organized inductive reasoning with NeMuS. In Artur d’Avila Garcez, Freddy Lecue, and Derek Doran, editors, NeSy 2019. CEUR Workshop Proceedings, August 2019.
- Boyer and Moore [1972] R. S. Boyer and J. S. Moore. The Sharing of Structure in Theorem-Proving Programs. In Machine Intelligence 7, pages 101–116. Edinburgh University Press, 1972.
- Cropper and Muggleton [2016] Andrew Cropper and Stephen Muggleton. Learning higher-order logic programs through abstraction and invention. In Proceedings of the Twenty-Fifth IJCAI, pages 1418–1424, July 2016.
- Idestam-Almquist [1993] P. Idestam-Almquist. Generalization under Implication by Recursive Anti-unification. In International Conference on Machine Learning, pages 151–158. Morgan-Kaufmann, 1993.
- Lloyd [1993] J. W. Lloyd. Foundations of Logic Programming, Second, Extended Edition. Springer-Verlag, 1993.
- Mota and Diniz [2016] E. de Souza Mota and Yan Brandão Diniz. Shared Multi-Space Representation for Neural-Symbolic Reasoning. In Tarek R. Besold, Luis Lamb, Luciano Serafini, and Whitney Tabor, editors, NeSy 2016, volume 1768. CEUR Workshop Proceedings, July 2016.
- Mota et al. [2017] E. de Souza Mota, Jacob M. Howe, and Artur d’Avila Garcez. Inductive Learning in Shared Neural Multi-Spaces. In Tarek R. Besold, Artur d’Avila Garcez, and Isaac Noble, editors, NeSy 2017, volume 2003. CEUR Workshop Proceedings, July 2017.
- Muggleton et al. [2014] S. H. Muggleton, D. Lin, N. Pahlavi, and A. Tamaddoni-Nezhad. Meta-interpretive learning: application to grammatical inference. Machine Learning, 94(1):25–49, 2014.
