Towards Ranking Schemas by Focus

Mattia Fumagalli; Daqian Shi; Fausto Giunchiglia

arXiv:2302.13591·cs.AI·February 28, 2023

TL;DR

This paper introduces a method to evaluate and rank knowledge base schemas based on their focus, defined as relevance for storing and retrieving information, using knowledge metrics and applied to over 200 schemas.

Contribution

It formalizes the concept of focus for knowledge schemas and proposes a ranking methodology based on knowledge metrics, validated on a large set of schemas.

Findings

01 The approach effectively ranks schemas by focus.

02 Experimental results demonstrate the method's utility.

03 Over 200 schemas were evaluated successfully.

Abstract

The main goal of this paper is to evaluate knowledge base schemas, modeled as a set of entity types, each such type being associated with a set of properties, according to their focus. We intuitively model the notion of focus as “the state or quality of being relevant in storing and retrieving information”. This definition of focus is adapted from the notion of “categorization purpose”, as first defined in cognitive psychology, thus giving us a high level of understandability on the side of users. In turn, this notion is formalized based on a set of knowledge metrics that, for any given focus, rank knowledge base schemas according to their quality. We apply the proposed methodology to more than 200 state-of-the-art knowledge base schemas. The experimental results show the utility of our approach.

Tables

Table 1: KBSs ranking

KBS	$Cue_{k}(K)$	$Cue_{kr}(K)$	$Focus(K)$
freebase:	8981	0.21	1.15
cal:	46	0.98	0.92
bibo:	71	0.97	0.92
opencyc-l:	6266	0.26	0.90
swpo:	87	0.88	0.83
cwmo:	107	0.85	0.80
eli:	62	0.84	0.78
ncal:	103	0.80	0.75
mo:	124	0.79	0.74
akt:	106	0.79	0.74

Table 2: Entity types ranking

KBS	entity type	$Cue_{e}(e)$	$Cue_{er}(e)$	$Focus(e)$
DBpedia:	person	169.02	0.69	1.42
opencyc-l:	firstordercoll.	230.59	0.30	1.30
freebase:	statisticalreg.	161.53	0.48	1.17
opencyc-l:	class	194.95	0.31	1.15
dicom:	ieimage	158.90	0.44	1.13
DBpedia:	place	116.97	0.63	1.13

Table 3: KBSs ranking for the entity type person

KBS	entity type	$Cue_{e}(e)$	$Cue_{er}(e)$	$Focus(e)$
DBpedia:	person	169.02	0.69	1.42
akt:	person	8.00	1.00	1.03
opencyc-l:	person	122.14	0.43	0.95
vivo:	person	10.60	0.88	0.92
swpo:	person	3.50	0.88	0.88
cwmo:	person	5.83	0.83	0.85

Equations

$Cue_{p}(p,e)=\frac{PoE(p,e)}{|dom(p)|}=c\in[0,1]$

$PoE(p,e)=\begin{cases}1,&\text{if }e\in dom(p)\\0,&\text{otherwise}\end{cases}$

$Cue_{e}(e)=\sum_{i=1}^{|prop(e)|}Cue_{p}(p_{i},e)=c\in[0,|prop(e)|]$

$Cue_{er}(e)=\frac{Cue_{e}(e)}{|prop(e)|}=c\in[0,1]$

$Cue_{k}(K)=\sum_{i=1}^{|E_{K}|}Cue_{e}(e_{i})=|prop(K)|$

$Cue_{kr}(K)=\frac{|prop(K)|}{\sum_{i=1}^{|E_{K}|}|prop(e_{i})|}=c\in[0,1]$

$Focus(e)=Cue_{e}^{\prime}(e)+Cue_{er}^{\prime}(e)$

$Focus(K)=Cue_{k}^{\prime}(K)+Cue_{kr}^{\prime}(K)$

Taxonomy

Topics: Semantic Web and Ontologies · Biomedical Text Mining and Ontologies · Advanced Text Analysis Techniques

Methods: Balanced Selection

Full text

Mattia Fumagalli (contact author), Daqian Shi, Fausto Giunchiglia

DISI - University of Trento

{mattia.fumagalli, daqian.shi, fausto.giunchiglia}@unitn.it

Abstract

The main goal of this paper is to evaluate knowledge base schemas, modeled as a set of entity types, each such type being associated with a set of properties, according to their focus. We intuitively model the notion of focus as “the state or quality of being relevant in storing and retrieving information”. This definition of focus is adapted from the notion of “categorization purpose”, as first defined in cognitive psychology, thus giving us a high level of understandability on the side of users. In turn, this notion is formalized based on a set of knowledge metrics that, for any given focus, rank knowledge base schemas according to their quality. We apply the proposed methodology to more than 200 state-of-the-art knowledge base schemas. The experimental results show the utility of our approach. (Code and data will be released upon acceptance.)

1 Introduction

Following contemporary descriptions by psychologists, the purpose of what we call categorization can be reduced to “…a means of simplifying the environment, of reducing the load on memory, and of helping us to store and retrieve information efficiently” Klapper et al. (2017); Rosch (1999); Harnad (2017). Loosely speaking, categorizing consists of putting things (events, items, objects, ideas, people) into categories (e.g., classes or types) based on their similarities, or common features. Without categorization we would be overwhelmed by the huge amount of diverse information coming from the external environment, and our mental life would be chaotic Millikan (2000); Giunchiglia and Fumagalli (2016).

In AI the purpose of categorization is usually implemented by well-defined and effective information objects, namely knowledge base schemas (KBSs); prominent examples include the schema layers of knowledge graphs (KGs) Qiao et al. (2016) and ontologies Guarino et al. (2009). KBSs offer many pivotal benefits Paulheim (2017), such as: i) human understandability; ii) a fixed and discrete view over a stream of multiple and diverse data; iii) a tree or a grid structure, so that each piece of information can be located by answering a determinate set of questions in order; and iv) an encoding in a formal language, which is a fragment of first-order predicate calculus. These benefits make it possible to build high-performance solutions to large-scale categorization problems, namely problems of efficient information storage and retrieval.

KBSs are the backbone of many semantic applications and play a central role in improving the efficiency of many “categorization systems” (like digital libraries or online stores). Unfortunately, their construction involves a huge effort in terms of time and domain-specific knowledge (see, for instance, well-known problems such as the “knowledge acquisition bottleneck” Shadbolt et al. (2015)). So far, in order to minimise the effort of building KBSs, a large number of search engines, catalogs and metrics have been produced to facilitate their reuse McDaniel and Storey (2019). The facilities provided by these solutions are very valuable, especially in the context of structural evaluation. However, there is still a lot of work to be done on functional, or purpose-driven, evaluation, in particular as concerns ranking KBSs based on the relevance of their concepts and on their categorization purpose Butt et al. (2016). (On relevance: “Something (A) is relevant to a task (T) if it increases the likelihood of accomplishing the goal (G), which is implied by T” Hjørland and Christensen (2002).) The main reason for this situation is that most of the purpose-driven features of KBSs are extremely difficult to quantify and are very context dependent Gangemi et al. (2005). Consider, for instance, the essential role of the qualifier “core” for describing ontologies and its multiple possible interpretations Falbo et al. (2016). As the number of available KBSs increases, this issue is bound to get worse: more and more KBSs need to be constructed and made available, and, as this happens, their purpose-driven evaluation for facilitating reuse becomes an even greater problem.

The main goal of this paper is to evaluate KBSs, modeled as a set of entity types, each such type being associated with a set of properties, according to their focus, where we intuitively model the notion of focus as “the state or quality of being relevant in storing and retrieving information”. This definition of focus is adapted from the notion of categorization purpose, as first defined in cognitive psychology, thus giving us a high level of understandability on the side of users.

In order to support an accurate level of KBSs reuse, we then propose a solution to rank KBSs that is articulated into three main contributions:

1. a cognitive psychology grounded account of the notion of focus (Section 2);

2. a set of metrics that apply to KBSs, their entity types, and their properties, which can be used to rank KBSs according to their focus (Section 3);

3. an analysis of the application of the metrics over a huge amount of state-of-the-art (SoA) data sets (Section 4).

The evaluation (Section 5) confirms the validity of the approach. Section 6 discusses the related work, Section 7 the conclusions.

2 Defining focus

Imagine that by saying “the green book on my desk in my office” someone wants someone else to bring her that book. This will happen only if the two subjects share a way of dividing objects into those that are offices and those that are not, those that are books and those that are non-books, green things and non-green things, desks and non-desks. These “object descriptions” are what must be conveyed in order to retrieve the intended objects. The point is to draw sharp lines around the group of objects to be described. That is the categorization purpose of an object description. These object descriptions, also called types, categories or classes, are the basis of the organization of our mental life. Meaning and communication heavily depend on this categorization Millikan (2000, 2017); Fumagalli et al. (2019).

Following the contemporary descriptions by psychologists, and, in particular, the seminal work by Eleanor Rosch Rosch (1999), the categorization purpose of object descriptions, or categories, can be explained according to two main dimensions, namely: i) the maximization of the number of features that describe the members of the given category and ii) the minimization of the number of features shared with other categories.

To evaluate these dimensions Eleanor Rosch introduces the central notion of cue validity Rosch and Mervis (1975). This notion was defined as “the conditional probability $p(c_{j}|f_{j})$ that an object falls in a category $c_{j}$ given a feature, or cue, $f_{j}$ ”, and then used to define the set of basic level categories, namely those categories which maximize the number of characteristics (i.e., features or attributes like “having a tail” and “being colored”) shared by their members and minimize the number of characteristics shared with the members of their sibling categories. The intuition is that basic level categories have higher cue validity and, because of this, they are more relevant in categorization.
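As an illustration of the conditional probability above, the following minimal sketch estimates $p(c|f)$ from labelled observations as the fraction of objects showing a feature that belong to a given category. The toy observations and the function name `cue_validity` are assumptions made for this example; they do not come from Rosch's experiments.

```python
from collections import Counter


def cue_validity(observations, category, feature):
    """Estimate p(category | feature) from (category, features) pairs."""
    # Categories of all observed objects that exhibit the feature.
    with_feature = [c for c, feats in observations if feature in feats]
    if not with_feature:
        return 0.0
    return Counter(with_feature)[category] / len(with_feature)


# Hypothetical toy observations: (category, set of observed features).
observations = [
    ("bird", {"has_wings", "has_tail"}),
    ("bird", {"has_wings", "has_tail"}),
    ("dog", {"has_tail"}),
    ("cat", {"has_tail"}),
]

print(cue_validity(observations, "bird", "has_wings"))  # 1.0
print(cue_validity(observations, "bird", "has_tail"))   # 0.5
```

In this toy data "has_wings" is a perfectly valid cue for "bird", while "has_tail" is shared with other categories and is therefore a weaker cue.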

Rosch’s definitions were designed for experiments where humans were asked to score objects as members of certain given categories. We adapt Rosch’s original methodology to the context of knowledge base schema design. In our setting, each available KBS (see, for instance, schema.org (http://schema.org/) or DBpedia (https://wiki.dbpedia.org/)) plays the role of a categorization, which is modeled as a set of entity types associated with a set of properties, whose main function is to draw sharp lines around the types of entities it contains, so that each member in its domain falls determinately either in or out of each entity type Giunchiglia and Fumagalli (2020); Giunchiglia and Fumagalli (2019). The knowledge engineers play a role similar to that of the persons involved in Rosch’s experiment. Each knowledge base schema provides a rich set of categorization examples. Each entity type plays the role of a category and all entity type properties play the role of features. The categorization purpose of the KBS is then mapped into the notion of focus. We intuitively model the notion of focus as “the state or quality of being relevant in storing and retrieving information” and we quantify the degree of this relevance by adapting Rosch’s notion of cue validity as follows:

• we take each property as having the same “cue validity” (which we assume to be normalized to one);

• for each KBS we equally divide the property “cue validity” across the entity types the property is associated with;

• by checking how widely the “cue validity” is spread we quantify the relevance of the KBS and entity types in storing and retrieving information.

The “focus” can then be calculated in relation to this analysis and, in turn, it can be functionally articulated into:

• the entity types focus, namely, what allows us to identify the entity types that are maximally informative categories, which have a higher categorization relevance, or, more precisely, which maximise the number of their properties and minimise the number of properties shared with other categories. These entity types are, to some extent, related to what expert users consider as “core entity types” or central entity types for a given domain;

• the KBSs focus, namely, what allows us to identify the KBSs that maximise the number of maximally informative (focused) entity types. These KBSs can be described, to some extent, as “clean” or “not-noisy” and are related to what expert users classify as well-designed KBSs Paulheim (2017).

3 Focus metrics

We assume that a KBS can be formalized as: $K=\left\langle E_{K},P_{K},I_{K}\right\rangle$ , with $E_{K}=\left\{e_{1},...,e_{n}\right\}$ being the set of entity types of K, $P_{K}=\left\{p_{1},...,p_{n}\right\}$ being the set of properties of K, and $I_{K}$ being a binary relation $I_{K}\subseteq E_{K}\times P_{K}$ that expresses which entity types are associated with which properties. We say that an entity type is associated with a property when the latter is used to describe the former, and that a property is associated with an entity type with the dual meaning. We also talk of an entity e being in the domain of a property p, in formulas e $\in$ dom(p), when e is associated with p. Thus, for instance, the entity Person can be in the domain of properties such as address or name, while the property address may be associated with entities such as Person or Building. Notice that the proposed formalization of entities and properties is different from, e.g., the OWL (https://www.w3.org/2001/sw/WebOnt/) representational language. The key difference can be clarified by considering our formalism as very similar to what is proposed by the Formal Concept Analysis (FCA) methods Ganter and Wille (2012). Our commitment to this model is motivated not only by foundational considerations, but also by pragmatic ones. In fact, once properties and entities are encoded as described above, data can be analyzed and processed with very few limitations. (See Goyal and Ferrara (2018) for an overview of the multiple available approaches and applications.)
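A minimal sketch of this formalization, assuming the relation $I_{K}$ is stored as a set of (entity type, property) pairs. The toy schema and the helper names `dom` and `prop` are illustrative assumptions; deriving $E_{K}$ and $P_{K}$ from the pairs additionally assumes that every entity type has at least one property and every property is used at least once.

```python
# Hypothetical toy KBS: I_K as a set of (entity type, property) pairs.
I_K = {
    ("Person", "name"),
    ("Person", "address"),
    ("Building", "address"),
    ("Building", "floors"),
}

E_K = {e for e, _ in I_K}  # entity types occurring in the relation
P_K = {p for _, p in I_K}  # properties occurring in the relation


def dom(p, relation):
    """Entity types associated with property p, i.e. dom(p)."""
    return {e for e, q in relation if q == p}


def prop(e, relation):
    """Properties associated with entity type e, i.e. prop(e)."""
    return {p for x, p in relation if x == e}


print(sorted(dom("address", I_K)))  # ['Building', 'Person']
print(sorted(prop("Person", I_K)))  # ['address', 'name']
```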

Given the above formalization we define a main set of metrics to evaluate KBSs according to their focus.

Firstly, we define the cue validity of a property p w.r.t. an entity e, also called $cue_{p}-validity$ , as:

$Cue_{p}(p,e)=\frac{PoE(p,e)}{|dom(p)|}=c\in[0,1]$

with $|X|$ being the cardinality of the set X and PoE(p, e) being defined as:

$PoE(p,e)=\begin{cases}1,&\text{if }e\in dom(p)\\0,&\text{otherwise}\end{cases}$

$Cue_{p}(p,e)$ returns 0 if p is not associated with e and 1/n otherwise, where n is the number of entities in the domain of p. In particular, if p is associated with only one entity type, its $cue_{p}-validity$ is maximum and equal to one.
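The definition of $Cue_{p}$ can be sketched as follows, assuming the schema is given as a set of (entity type, property) pairs. The toy relation and the function names `dom`, `poe` and `cue_p` are assumptions made for this example.

```python
def dom(p, relation):
    """Entity types associated with property p."""
    return {e for e, q in relation if q == p}


def poe(p, e, relation):
    """PoE(p, e): 1 if e is in dom(p), 0 otherwise."""
    return 1 if e in dom(p, relation) else 0


def cue_p(p, e, relation):
    """Cue_p(p, e) = PoE(p, e) / |dom(p)|, or 0.0 for an unused property."""
    size = len(dom(p, relation))
    return poe(p, e, relation) / size if size else 0.0


# Hypothetical toy relation of (entity type, property) pairs.
I_K = {("Person", "name"), ("Person", "address"), ("Building", "address")}

print(cue_p("name", "Person", I_K))     # 1.0 (name describes only Person)
print(cue_p("address", "Person", I_K))  # 0.5 (address is shared by two types)
print(cue_p("name", "Building", I_K))   # 0.0 (name is not associated with Building)
```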

Given the notion of $cue_{p}-validity$ we define the notion of cue validity of an entity type, also called $cue_{e}-validity$ , as the sum of the cue validities of the properties associated with the entity, namely:

$Cue_{e}(e)=\sum_{i=1}^{|prop(e)|}Cue_{p}(p_{i},e)=c\in[0,|prop(e)|]$

$Cue_{e}-validity$ provides the centrality of an entity in a given KBS, by summing the cues of all its properties. The higher this value, the more the entity type maximizes the number of properties that describe the members it categorises.

Given the notion of $Cue_{e}-validity$ , we capture the level of minimisation of the number of properties shared with other entity types, inside a KBS with the notion of $cue_{er}-validity$ , which we define as:

$Cue_{er}(e)=\frac{Cue_{e}(e)}{|prop(e)|}=c\in[0,1]$
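A minimal sketch of the two entity-type metrics, assuming again that the schema is a set of (entity type, property) pairs. The toy relation and the function name `cue_e_scores` are assumptions made for this example.

```python
from collections import defaultdict


def cue_e_scores(relation):
    """Return {entity type: (Cue_e, Cue_er)} for (entity type, property) pairs."""
    dom = defaultdict(set)   # property -> entity types it describes
    prop = defaultdict(set)  # entity type -> its properties
    for e, p in relation:
        dom[p].add(e)
        prop[e].add(p)
    scores = {}
    for e, props in prop.items():
        # Each property contributes 1 / |dom(p)| to the entity types it describes.
        cue_e = sum(1 / len(dom[p]) for p in props)
        scores[e] = (cue_e, cue_e / len(props))
    return scores


# Hypothetical toy relation.
I_K = {
    ("Person", "name"), ("Person", "birthdate"), ("Person", "address"),
    ("Building", "address"), ("Building", "name"),
}
scores = cue_e_scores(I_K)
print(scores["Building"])  # (1.0, 0.5)
print(scores["Person"])    # Cue_e = 2.0, Cue_er = 2/3
```

Here Person owns one exclusive property (birthdate) and shares two, so it scores higher than Building on both metrics; Building shares all of its properties.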

The notions and terminology used for entity types, i.e., the notions of $Cue_{e}$ and $Cue_{er}$ , can be straightforwardly generalized to KBSs, generating the following metrics:

$Cue_{k}(K)=\sum_{i=1}^{|E_{K}|}Cue_{e}(e_{i})=|prop(K)|$

The $Cue_{k}(K)$ is calculated as a summation of the cues of all the entity types of a given KBS, i.e., $E_{K}$ , and returns the number of the properties of the KBS, i.e., $|prop(K)|$ . Following the formalization of $Cue_{er}$ we capture the level of minimisation of the number of properties shared across the entity types inside the schema with the notion of $cue_{kr}-validity$ , which we define as:

$Cue_{kr}(K)=\frac{|prop(K)|}{\sum_{i=1}^{|E_{K}|}|prop(e_{i})|}=c\in[0,1]$
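The schema-level metrics can be sketched as follows, assuming the schema is a set of (entity type, property) pairs. The sketch relies on two observations: summing $Cue_{e}$ over all entity types gives each property a total weight of one, so $Cue_{k}(K)$ equals the number of distinct properties, and the sum of $|prop(e_{i})|$ over all entity types equals the number of distinct pairs. The toy relation and the function name `cue_k_scores` are assumptions made for this example.

```python
def cue_k_scores(relation):
    """Return (Cue_k, Cue_kr) for a KBS given as (entity type, property) pairs."""
    pairs = set(relation)
    cue_k = len({p for _, p in pairs})  # |prop(K)|: distinct properties
    associations = len(pairs)           # sum of |prop(e_i)| over entity types
    cue_kr = cue_k / associations if associations else 0.0
    return cue_k, cue_kr


# Hypothetical toy relation: 3 distinct properties, 5 associations.
I_K = {
    ("Person", "name"), ("Person", "birthdate"), ("Person", "address"),
    ("Building", "address"), ("Building", "name"),
}
print(cue_k_scores(I_K))  # (3, 0.6)
```

A schema in which no property is shared has as many associations as properties, so its $Cue_{kr}$ is 1; the more properties are reused across entity types, the lower the value.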

Given the definitions of Cues for entity types, we capture the categorization relevance of entity types, i.e., the entity types that maximise the amount of properties and minimise the number of shared properties, with the notion of $Focus(e)$ , which we define as follows:

$Focus(e)=Cue_{e}^{\prime}(e)+Cue_{er}^{\prime}(e)$

where $Cue_{e}^{{}^{\prime}}(e)$ is a log normalization Bornemann et al. (1981) of $Cue_{e}(e)$ and $Cue_{er}^{{}^{\prime}}(e)$ is a min-max normalization Jain and Bhandare (2011) of $Cue_{er}(e)$ , considering all the $Cue_{er}(e)$ and $Cue_{e}(e)$ values from a given set of KBSs $\left\{K\right\}$ .

Similarly, given the definitions of Cues for KBSs, we capture the categorization relevance of KBSs, i.e., the KBSs that maximise the number of maximally informative entity types, with the notion of $Focus(K)$ , which we define as follows:

$Focus(K)=Cue_{k}^{\prime}(K)+Cue_{kr}^{\prime}(K)$

where $Cue_{k}^{{}^{\prime}}(K)$ is a log normalization of $Cue_{k}(K)$ and $Cue_{kr}^{{}^{\prime}}(K)$ is a min-max normalization of $Cue_{kr}(K)$ , considering all the $Cue_{k}(K)$ and $Cue_{kr}(K)$ values from a given set of KBSs $\left\{K\right\}$ . Notice that we used log normalization for $Cue_{e}(e)$ and $Cue_{k}(K)$ because of the wide range of $Cue_{e}(e)$ and $Cue_{k}(K)$ values across the KBSs (e.g., we may have a KBS with $Cue_{e}(e)=10$ and a KBS with $Cue_{e}(e)=300$ ).
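A minimal sketch of how the two normalized terms can be combined into a focus score. The text does not spell out the exact log normalization, so the form used here, $\log(1+x)/\log(1+x_{max})$, is an assumption, as are the input values and the function names; the sketch is not meant to reproduce the scores reported in the tables.

```python
import math


def min_max(values):
    """Min-max normalization into [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def log_norm(values):
    """Assumed log normalization: log(1 + v) / log(1 + max(values))."""
    top = math.log1p(max(values))
    if top == 0:
        return [0.0 for _ in values]
    return [math.log1p(v) / top for v in values]


def focus(cues, ratios):
    """Focus = log-normalized cue + min-max-normalized cue ratio."""
    return [c + r for c, r in zip(log_norm(cues), min_max(ratios))]


# Hypothetical (Cue_e, Cue_er) values for three entity types.
names = ["A", "B", "C"]
cue_e = [300.0, 10.0, 1.0]
cue_er = [0.3, 0.9, 0.5]

ranking = sorted(zip(names, focus(cue_e, cue_er)), key=lambda x: x[1], reverse=True)
print([name for name, _ in ranking])  # ['B', 'A', 'C']
```

With these inputs the entity type with few but mostly exclusive properties (B) outranks the one with many heavily shared properties (A), which is the trade-off the sum of the two terms is meant to capture.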

4 Measuring focus

We started from a data set of around 700 KBSs, expressed in the Terse RDF Triple Language (Turtle, https://www.w3.org/TR/turtle/) format. Most of these resources have been taken from the LOV catalog (https://lov.linkeddata.es/dataset/lov). The remaining ones, for instance freebase (https://developers.google.com/freebase) and SUMO (http://www.adampease.org/OP/), have been added to collect more data.

For the sake of the analysis, all the data sets have been flattened into a set of sets of triples (one set per entity type), where each triple encodes information about “entitytype-property” associations $I_{K}(e)$ (e.g., the triple “Person-domainOf-friend” encodes the “Person-friend” $I_{K}(e)$ association). Moreover, in order to generate the final output data sets we processed property labels via an NLP pipeline that performs various steps, including, for instance: i) splitting a string every time a capital letter is encountered (e.g., birthDate $\rightarrow$ birth and date); ii) lower-casing all characters; iii) filtering out stop-words (e.g., hasAuthor $\rightarrow$ author). This allowed us to run a more accurate analysis. For instance, if “Person” and “Place” have properties like “globalLocationNumberInformation” and “LocationNumber”, respectively, processing the labels in this way makes it possible to find some overlap (see “location” and “number”), which would otherwise go undetected.
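The three label-processing steps listed above can be sketched as follows. The regular expression, the tiny stop-word list and the function name `label_tokens` are assumptions made for this example; the pipeline actually used is not described in more detail.

```python
import re

# Hypothetical, deliberately small stop-word list.
STOP_WORDS = {"has", "is", "of", "the", "a", "an"}


def label_tokens(label):
    """Split a camelCase label at capital letters, lower-case, drop stop-words."""
    # A lower-case run with an optional leading capital, or a run of capitals
    # (an acronym) that is not the start of a capitalized word.
    words = re.findall(r"[A-Z]?[a-z0-9]+|[A-Z]+(?![a-z])", label)
    return [w.lower() for w in words if w.lower() not in STOP_WORDS]


print(label_tokens("birthDate"))  # ['birth', 'date']
print(label_tokens("hasAuthor"))  # ['author']

overlap = set(label_tokens("globalLocationNumberInformation")) & set(
    label_tokens("LocationNumber")
)
print(sorted(overlap))  # ['location', 'number']
```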

Due to the lack of space, after the above processing we selected a subset of the starting data set, by discarding all the KBSs with fewer than 30 entity types. An overall view of the final output data set is provided by Figure 1, where, for each of the 44 KBSs, the number of properties, the number of entity types and the balance are provided. The balance returns the value of a simple distribution of the properties of a KBS across its entity types and it is calculated as $\frac{|prop(K_{i})|}{|E_{K_{i}}|}\ast\frac{1}{|prop(e_{i})|_{max}}$ , with $|prop(K_{i})|$ being the cardinality of the set of properties of the KBS, $|E_{K_{i}}|$ being the cardinality of the set of entities of the KBS and $|prop(e_{i})|_{max}$ being the cardinality of the set of properties associated with the entity that has the largest number of properties in the KBS.
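A minimal sketch of the balance formula, assuming the schema is a set of (entity type, property) pairs. The toy relation and the function name `balance` are assumptions made for this example.

```python
from collections import defaultdict


def balance(relation):
    """(|prop(K)| / |E_K|) * (1 / max_e |prop(e)|) for (entity type, property) pairs."""
    prop = defaultdict(set)
    for e, p in relation:
        prop[e].add(p)
    if not prop:
        return 0.0
    n_properties = len({p for _, p in relation})
    max_props = max(len(ps) for ps in prop.values())
    return (n_properties / len(prop)) / max_props


# Hypothetical toy relation: 3 properties, 2 entity types, at most 3 properties per type.
I_K = {
    ("Person", "name"), ("Person", "birthdate"), ("Person", "address"),
    ("Building", "address"), ("Building", "name"),
}
print(balance(I_K))  # 0.5
```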

By applying the cue entity metrics, i.e., $Cue_{e}(e)$ and $Cue_{er}(e)$ , to the KBSs of the resulting list, we obtained the scores to evaluate the categorization relevance of the entity types for each KBS. Let us take, for instance, the values shown in Fig. 2. We randomly selected eight KBSs from the starting set and we listed them according to the number of entity types. The selected KBSs are: freebase, opencyc-light (https://pythonhosted.org/ordf/ordf_vocab_opencyc.html), DBpedia, SUMO, schema.org, md (http://def.seegrid.csiro.au/isotc211/iso19115/2003/metadata), pext (http://www.ontotext.com/proton/protonext.html) and ludo-gm (http://ns.inria.fr/ludo/v1/docs/gamemodel.html). The corresponding scatter plots provide the correlations between (a min-max normalization of) $Cue_{e}(e)$ and $Cue_{er}(e)$ for each entity type of each of the selected KBSs. The top-right entity types are the ones with the higher categorization relevance according to our metrics. For instance, in SUMO we have entity types like GeopoliticalArea and GeographicalArea and in DBpedia we have Person and Place.

By applying $Focus(K)$ over the set of 44 KBSs we obtained the KBS ranking, where the top 11 KBSs are reported in Tab. 1. By applying $Focus(e)$ over the set of 44 KBSs we obtained the entity types ranking, where the top 6 entity types in terms of categorization relevance are reported in Tab. 2. Finally, by selecting a given entity type and applying $Focus(e)$ , it is possible to find the best KBS for that entity type. Table 3 provides an example for the entity type Person.

5 Focus evaluation

We organize the evaluation in two parts. In Section 5.1 we analyse the accuracy of the $Focus(e)$ metric in weighting the categorization relevance of entity types, namely their centrality in the maximization of information. This will be done by applying our metrics and some related SoA ranking algorithms over a set of example KBSs. Then we compare the results with a reference data set generated by 5 knowledge engineers, to which we provided a set of instructions/ guidelines to rank the entity types, taking inspiration from Rosch’s experiment Rosch (1999).

In Section 5.2, given the lack of baseline metrics for calculating the overall score of a KBS on similar functions, and the lack of reference gold standards, we analyse the effects that the $Focus(K)$ of a KBS may have on the prediction performance of a relational classification task, thus showing a possible practical application of the given measure.

5.1 Evaluating Focus(e)

The question here is whether $Focus(e)$ allows us to rank entity types in KBSs according to their categorization relevance, as described in Section 2. To evaluate our metric we firstly selected a subset of the KBSs discussed in the previous section, namely akt (https://lov.linkeddata.es/dataset/lov/vocabs/akt), cwmo (https://gabriel-alex.github.io/cwmo/), ncal, pext, schema.org, spt (https://github.com/dbpedia/ontology-tracker/tree/master/ontologies/spitfire-project.eu) and SUMO. We selected these KBSs because they provide very different examples in terms of number of properties, entities, balance and cues. Moreover, almost all their entity type labels are human understandable (a lot of KBSs have entity type labels codified by an ID). As a second step we selected four SoA ranking algorithms, namely TF-IDF Salton and Buckley (1988), BM25 Robertson et al. (1995), Class Match Measure (CMM) and Density Measure (DEM) Alani and Brewster (2006). We used the performance of these rankings as a baseline, by selecting their scores for the top 10 entity types, for each of the given KBSs, and we compared them with the rankings provided by $Focus(e)$ . The relevance of our approach was then measured in terms of accuracy (from 0 to 1) by checking how many entity types of the ranking results are in the entity type ranking lists provided by the knowledge engineers. The output of this experiment is represented by the data in Figure 3.
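One way to read the accuracy measure described above is as the fraction of the top-ranked entity types that also appear in the reference list. The sketch below assumes this reading, including the choice of the number of ranked items as denominator; the entity type names and the function name `ranking_accuracy` are hypothetical.

```python
def ranking_accuracy(ranked, reference, k=10):
    """Fraction of the top-k ranked entity types found in the reference list."""
    top = ranked[:k]
    if not top:
        return 0.0
    reference_set = set(reference)
    return sum(1 for e in top if e in reference_set) / len(top)


# Hypothetical ranking produced by a metric, and a hypothetical reference list.
ranked = ["Person", "Place", "Event", "Thing"]
reference = ["Person", "Event", "Organization"]

print(ranking_accuracy(ranked, reference, k=4))  # 0.5
```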

Looking at the chart, the line represents the accuracy of the ranking trend provided by our $Focus(e)$ . Each bar represents the accuracy of the ranking, for each selected SoA algorithm. All the accuracy results are grouped w.r.t. the reference KBS.

The first main observation is that all the reference SoA metrics show a very similar trend, with higher accuracy for akt, cwmo, ncal and spt, and lower accuracy for schema.org and SUMO. This is not the case for $Focus(e)$ . Our metric, indeed, even if it is not the best for all the KBSs, performs best with huge and very noisy KBSs (with lower $Cue_{kr}$ , and therefore lower entity type $Cue_{er}$ ), as is the case for schema.org and SUMO (just check the visualization of SUMO and schema.org in Figure 2 to observe the phenomenon). This, as we expected, depends on the pivotal role we gave to the minimization of the number of overlapping properties. The $Cue_{er}$ of each entity type indeed provides essential information about categorization relevance, which may not be properly identified by approaches that give more importance to the number of properties of an entity type. Thus, given small and not-noisy (or “clean”) KBSs, other approaches that are very focused on the number of properties of entity types perform very well (see the good performance of the TF-IDF algorithm). By contrast, when KBSs present a huge number of entity types with low $Cue_{er}$ , $Focus(e)$ identifies the categorization relevance better.

The second main observation is that TF-IDF and $Focus(e)$ are the best metrics in terms of average performance, with a mean accuracy of 0.52 for both, vs. 0.47 for BM25 and CMM, and 0.44 for DEM. This score is explained by the fact that TF-IDF is almost always the best when the given KBS is small and “clean”, while $Focus(e)$ compensates for its ordinary performance on small and clean KBSs with a high performance on huge and noisy KBSs.

5.2 Evaluating Focus(K)

The question here is whether $Focus(K)$ helps to predict the performance of KBSs in their ability to predict their own entity types. In this experiment we used the same KBSs we selected in the previous experiment to address relational classification, where entity types have an associated label and the task is to predict those labels. Notice that we addressed a specific type of relational classification, namely an entity type recognition task (ETR), as defined in Giunchiglia and Fumagalli (2020). We set up the experiment as follows: i) we trained a decision tree and a k-NN Kamiński et al. (2018); Dasarathy (1991) model with each KBS converted into FCA format (see details provided in Section 4) and performed standard nested cross-validation (with 50 folds); ii) we reported the relative performance of the models in terms of differences in accuracy and compared the performances with the $Focus(K)$ for each of the given KBSs. The results are shown in Figure 4.
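The conversion into FCA format can be pictured as building a binary cross-table (a formal context) with one row per entity type and one column per property, which is the kind of input a decision tree or a k-NN classifier can consume. The sketch below is an assumption about that encoding, since the exact format is not detailed here; the toy relation and the function name `formal_context` are hypothetical.

```python
def formal_context(relation):
    """Build a binary cross-table from (entity type, property) pairs.

    Returns the sorted property names (the columns) and a mapping from each
    entity type to its 0/1 row over those columns.
    """
    pairs = set(relation)
    entities = sorted({e for e, _ in pairs})
    properties = sorted({p for _, p in pairs})
    rows = {e: [1 if (e, p) in pairs else 0 for p in properties] for e in entities}
    return properties, rows


# Hypothetical toy relation.
I_K = {
    ("Person", "name"), ("Person", "birthdate"), ("Person", "address"),
    ("Building", "address"), ("Building", "name"),
}
properties, rows = formal_context(I_K)
print(properties)        # ['address', 'birthdate', 'name']
print(rows["Person"])    # [1, 1, 1]
print(rows["Building"])  # [1, 0, 1]
```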

Looking at the chart, the accuracy is reported as a proportion of correct predictions, within the range of [0%,100%] (see charts bars). The $Focus(K)$ is reported by the values of the line. The cwmo KBS is the one with the best scores, in terms of accuracy (for both the trained models) and $Focus(K)$ . schema.org is the worst.

The main observation is that, as expected, the trend in terms of accuracy, considering both models, follows the $Focus(K)$ ranking for most of the given KBSs. However, k-NN on the pext KBS is an exception: pext is worse than akt in terms of $Focus(K)$ , but performs better with k-NN. Looking deeper into the analysis, this phenomenon can be explained by the relationship between the number of properties and the number of entity types, more specifically by the balance of the KBS. This value can indeed affect the performance of the model in prediction. The higher the balance, the higher the probability of having entity types with a low focus. This effect is quite evident if we consider two KBSs with very similar $Focus(K)$ but very different balance.

This experiment, while showing how $Focus(K)$ can be a concrete explanation of the categorization relevance of a KBS, suggests the possibility of a practical application of $Focus(K)$ to evaluate the potential performance of a KBS or a set of KBSs in a relational classification task. The results may be used, e.g., to fine-tune KBSs in an open-world data integration scenario.

6 Related work

This work shares with the work on ontology and knowledge graph (KG) schema (functional) evaluation McDaniel and Storey (2019); Paulheim (2017); Brank et al. (2005); Gangemi et al. (2005) the goal of facilitating the reuse of these knowledge structures. This work has been extensive and has exploited a large number of methods and techniques including, e.g., DWRank Butt et al. (2016) and the NCBO recommender Martínez-Romero et al. (2017) (the former being a “learning to rank” approach based on search queries, the latter being a high-precision recommender for biomedical ontologies).

Our work differs from this in two major respects. The first is that we ground our approach and the notion of focus on the notion of categorization purpose from cognitive psychology. The theoretical underpinning of our formalization of the metrics and the experimental setup, is then inspired by the analysis of human behavior in categorization, and in particular by the seminal work by E. Rosch. Our goal is not to redefine terminology already in use in the related work, but rather to propose a both theoretically and practically useful formalization of the central activity of categorization, which can be considered as the baseline of each knowledge engineering task. The second difference, which is actually a consequence of the first, is that, while most of the functional evaluation approaches are related to the intended use of a given KBS, and consider functional dimensions, like task and domain, which are very context dependent, this is not the case with our approach. The notion of focus we adapted, indeed, aims to model a privileged level of categorization, independently from the tasks and the domain of application of the data structure. This in turn allows us to devise a somewhat opposite approach. In fact, the domain of a KBS can be then identified through the focus scores. For instance, the fact that a KBS has a high focus for entity types like CreativeWork or Product, will help the user to understand what is the real potential of that KBS for a given domain of application.

As a last consideration, it is important to observe that the notion of cue validity has been widely studied in the context of feature engineering. Together with other similar measures, such as “category utility” or “mutual information”, it has been used to measure the informativeness of a category Peng et al. (2005). Our approach differs from the related work in the application of Rosch’s notion at the KBS level, rather than on data. Moreover, the introduction of the “overall” Focus metrics to rank categorization relevance is a novel contribution.

7 Conclusion

In this paper, we have proposed a formal method to evaluate KBSs according to their focus, namely, what cognitive psychologists call categorization purpose. This in turn has allowed us to describe how this evaluation plays an important role in supporting an accurate level of KBSs understanding and reuse. The future work will concentrate on an extension of the proposed metrics, possibly by considering the hierarchical structure of KBSs, an extension of the experimental set-up and an implementation of the metrics for supporting the search engine of a large number of existing high-quality KBSs.

Bibliography

Alani and Brewster [2006] Harith Alani and Christopher Brewster. Metrics for ranking ontologies. 2006.
Bornemann et al. [1981] E Bornemann, JH Doveton, et al. Log normalization by trend surface analysis. The Log Analyst, 22(04), 1981.
Brank et al. [2005] Janez Brank, Marko Grobelnik, and Dunja Mladenic. A survey of ontology evaluation techniques. In Proceedings of the conference on data mining and data warehouses (SiKDD 2005), pages 166–170. Citeseer Ljubljana, Slovenia, 2005.
Butt et al. [2016] Anila Sahar Butt, Armin Haller, and Lexing Xie. Dwrank: Learning concept ranking for ontology search. Semantic Web, 7(4):447–461, 2016.
Dasarathy [1991] Belur V Dasarathy. Nearest neighbor (nn) norms: Nn pattern classification techniques. IEEE Computer Society Tutorial, 1991.
Falbo et al. [2016] RA Falbo, MP Barcellos, FB Ruy, G Guizzardi, and RSS Guizzardi. Ontology pattern languages. In Ontology Engineering with Ontology Design Patterns: Foundations and Applications. IOS Press, 2016.
Fumagalli et al. [2019] Mattia Fumagalli, Gábor Bella, and Fausto Giunchiglia. Towards understanding classification and identification. In Pacific Rim International Conference on Artificial Intelligence, pages 71–84. Springer, 2019.
Gangemi et al. [2005] Aldo Gangemi, Carola Catenacci, Massimiliano Ciaramita, and Jos Lehmann. A theoretical framework for ontology evaluation and validation. In SWAP, volume 166, page 16. Citeseer, 2005.