Background Knowledge in Schema Matching: Strategy vs. Data

Jan Portisch; Michael Hladik; Heiko Paulheim

arXiv:2107.00001·cs.DB·July 2, 2021

TL;DR

This paper evaluates how different background knowledge sources and exploitation strategies affect schema matching. It finds that explicit strategies outperform latent ones and that BabelNet is a consistently strong resource.

Contribution

The paper systematically compares six general-purpose knowledge graphs under three exploitation strategies and shows that the choice of strategy affects the final alignment more than the choice of background dataset.

Findings

01. Explicit strategies outperform latent ones.

02. The choice of strategy affects results more than the choice of background dataset.

03. BabelNet provides consistently good matching performance.

Abstract

The use of external background knowledge can be beneficial for the task of matching schemas or ontologies automatically. In this paper, we exploit six general-purpose knowledge graphs as sources of background knowledge for the matching task. The background sources are evaluated by applying three different exploitation strategies. We find that explicit strategies still outperform latent ones and that the choice of the strategy has a greater impact on the final alignment than the actual background dataset on which the strategy is applied. While we could not identify a universally superior resource, BabelNet achieved consistently good results. Our best matcher configuration with BabelNet performs very competitively when compared to other matching systems even though no dataset-specific optimizations were made.
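
The abstract contrasts explicit and latent strategies without defining them here. As a rough illustration only, the sketch below assumes that an explicit strategy looks up relations asserted in the background source (here, synonyms) and that a latent strategy compares vector representations derived from it. The synonym table, the vectors, the threshold, and all function names are hypothetical toy values, not the authors' matchers or data.

```python
import math

# Hypothetical toy background knowledge (not taken from the paper).
SYNONYMS = {
    "author": {"writer", "creator"},
    "paper": {"article", "publication"},
}
VECTORS = {
    "author": (0.9, 0.1),
    "writer": (0.8, 0.2),
    "paper": (0.1, 0.9),
    "article": (0.2, 0.8),
}


def explicit_match(a, b):
    """Assumed explicit strategy: labels are equal or asserted as synonyms."""
    return a == b or b in SYNONYMS.get(a, set()) or a in SYNONYMS.get(b, set())


def cosine(u, v):
    """Cosine similarity of two equal-length numeric vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    return dot / (norm_u * norm_v)


def latent_match(a, b, threshold=0.95):
    """Assumed latent strategy: vector similarity above a chosen threshold."""
    if a not in VECTORS or b not in VECTORS:
        return False
    return cosine(VECTORS[a], VECTORS[b]) >= threshold


def align(source, target, matcher):
    """Return all (source, target) label pairs accepted by the matcher."""
    return [(s, t) for s in source for t in target if matcher(s, t)]


source_schema = ["author", "paper"]
target_schema = ["writer", "article"]
explicit_alignment = align(source_schema, target_schema, explicit_match)
latent_alignment = align(source_schema, target_schema, latent_match)
```

On this toy input both matchers return the same pairs. In practice the two approaches can diverge: a lookup depends on which relations the source asserts, while a similarity threshold depends on the vectors and the cutoff chosen.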

Code & Models

Repositories

janothan/bk-strategy-vs-data-supplements (official)
