# Canonicalizing Knowledge Base Literals

**Authors:** Jiaoyan Chen, Ernesto Jimenez-Ruiz, Ian Horrocks

arXiv: 1906.11180 · 2019-06-27

## TL;DR

This paper presents a framework that combines reasoning and machine learning to automatically canonicalize string literals in knowledge bases, replacing each literal with an existing KB entity or with a new entity typed using KB classes.

## Contribution

The paper introduces a combined reasoning and machine learning approach for canonicalizing literals in KBs, addressing the quality issue of string literals being used in place of semantically typed entities.

## Key findings

- Outperforms state-of-the-art baselines in semantic typing
- Achieves higher accuracy in entity matching
- Demonstrates effective integration of reasoning and machine learning

## Abstract

Ontology-based knowledge bases (KBs) like DBpedia are very valuable resources, but their usefulness and usability are limited by various quality issues. One such issue is the use of string literals instead of semantically typed entities. In this paper we study the automated canonicalization of such literals, i.e., replacing the literal with an existing entity from the KB or with a new entity that is typed using classes from the KB. We propose a framework that combines both reasoning and machine learning in order to predict the relevant entities and types, and we evaluate this framework against state-of-the-art baselines for both semantic typing and entity matching.
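To make the task definition concrete, the following is a minimal sketch of what "canonicalizing a literal" means as an input/output contract. It is not the authors' framework: the toy KB, the `ex:` prefix, the `Entity` class, the `canonicalize` function, and the 0.9 threshold are all hypothetical, and plain string similarity from Python's `difflib` stands in for the paper's learned type prediction and reasoning steps. The candidate types are assumed to be given.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher


@dataclass(frozen=True)
class Entity:
    uri: str
    label: str
    types: frozenset


# Hypothetical toy KB; a real KB such as DBpedia would be queried instead.
KB = [
    Entity("ex:Oxford", "Oxford", frozenset({"ex:City"})),
    Entity("ex:University_of_Oxford", "University of Oxford",
           frozenset({"ex:University"})),
]


def similarity(a: str, b: str) -> float:
    """Case-insensitive string similarity in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def canonicalize(literal, predicted_types, kb=KB, threshold=0.9):
    """Map a string literal to an existing entity or to a new typed entity.

    predicted_types is assumed to come from some upstream typing step;
    here it is simply passed in by the caller.
    """
    # Keep only entities that share at least one predicted type.
    candidates = [e for e in kb if e.types & predicted_types]
    best = max(candidates, key=lambda e: similarity(literal, e.label),
               default=None)
    if best is not None and similarity(literal, best.label) >= threshold:
        return best  # case 1: replace the literal with an existing entity
    # Case 2: mint a new entity typed with the predicted KB classes.
    new_uri = "ex:new/" + literal.strip().replace(" ", "_")
    return Entity(new_uri, literal, frozenset(predicted_types))


matched = canonicalize("university of oxford", {"ex:University"})
minted = canonicalize("Imperial College", {"ex:University"})
```

In this sketch, `matched` resolves to the existing KB entity, while `minted` has no close match and becomes a new entity typed as `ex:University`. These correspond to the two outcomes the abstract describes.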

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1906.11180/full.md

## Figures

4 figures with captions in the complete paper: https://tomesphere.com/paper/1906.11180/full.md

## References

31 references — full list in the complete paper: https://tomesphere.com/paper/1906.11180/full.md

---
Source: https://tomesphere.com/paper/1906.11180