SiGMa: Simple Greedy Matching for Aligning Large Knowledge Bases
Simon Lacoste-Julien, Konstantina Palla, Alex Davies, Gjergji Kasneci, Thore Graepel, Zoubin Ghahramani

TL;DR
SiGMa is a scalable, greedy algorithm that efficiently aligns large knowledge bases by leveraging structural and property similarities, outperforming existing methods in accuracy and speed.
Contribution
The paper introduces SiGMa, a simple yet effective greedy algorithm for large-scale knowledge base alignment that combines structural and property-based similarity measures.
Findings
SiGMa achieves high precision in aligning large knowledge bases.
It outperforms state-of-the-art methods in accuracy and efficiency.
The algorithm is scalable to knowledge bases with millions of entities.
Abstract
The Internet has enabled the creation of a growing number of large-scale knowledge bases in a variety of domains containing complementary information. Tools for automatically aligning these knowledge bases would make it possible to unify many sources of structured knowledge and answer complex queries. However, the efficient alignment of large-scale knowledge bases still poses a considerable challenge. Here, we present Simple Greedy Matching (SiGMa), a simple algorithm for aligning knowledge bases with millions of entities and facts. SiGMa is an iterative propagation algorithm that leverages both the structural information from the relationship graph and flexible similarity measures between entity properties in a greedy local search, which makes it scalable. Despite its greedy nature, our experiments indicate that SiGMa can efficiently match some of the world's largest knowledge…
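The greedy propagation idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the scoring formula, the `alpha` weight, the `threshold` cutoff, the use of `difflib.SequenceMatcher` as the property similarity, and the toy data are all assumptions made for illustration. The sketch also assumes undirected neighbor sets (if `a` lists `b`, then `b` lists `a`).

```python
import heapq
from difflib import SequenceMatcher


def name_similarity(a, b):
    # Stand-in property similarity (assumption): case-insensitive string ratio.
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def graph_similarity(e1, e2, nbrs1, nbrs2, matched):
    # Fraction of e1's neighbors already matched to a neighbor of e2.
    n1, n2 = nbrs1.get(e1, set()), nbrs2.get(e2, set())
    if not n1 or not n2:
        return 0.0
    shared = sum(1 for n in n1 if matched.get(n) in n2)
    return shared / min(len(n1), len(n2))


def greedy_align(nbrs1, nbrs2, names1, names2, seeds, alpha=0.5, threshold=0.3):
    matched = {}   # entity in KB 1 -> entity in KB 2
    used2 = set()  # KB 2 entities that are already taken
    heap = []      # max-heap emulated with negated scores

    def score(e1, e2):
        return ((1 - alpha) * name_similarity(names1[e1], names2[e2])
                + alpha * graph_similarity(e1, e2, nbrs1, nbrs2, matched))

    def push(e1, e2):
        if e1 not in matched and e2 not in used2:
            heapq.heappush(heap, (-score(e1, e2), e1, e2))

    def accept(e1, e2):
        matched[e1] = e2
        used2.add(e2)
        # Propagate: neighbors of a matched pair become (re-scored) candidates.
        for n1 in nbrs1.get(e1, ()):
            for n2 in nbrs2.get(e2, ()):
                push(n1, n2)

    for e1, e2 in seeds:
        accept(e1, e2)

    while heap:
        neg_score, e1, e2 = heapq.heappop(heap)
        if e1 in matched or e2 in used2:
            continue  # stale candidate: one side was matched in the meantime
        if -neg_score < threshold:
            break     # best remaining candidate is too weak
        accept(e1, e2)
    return matched


# Toy knowledge bases (hypothetical data): neighbor sets and entity names.
nbrs1 = {"a1": {"m1"}, "m1": {"a1", "d1"}, "d1": {"m1"}}
names1 = {"a1": "Tom Hanks", "m1": "Forrest Gump", "d1": "Robert Zemeckis"}
nbrs2 = {"a2": {"m2"}, "m2": {"a2", "d2"}, "d2": {"m2"}}
names2 = {"a2": "Hanks, Tom", "m2": "Forrest Gump (1994)", "d2": "R. Zemeckis"}

alignment = greedy_align(nbrs1, nbrs2, names1, names2, seeds=[("m1", "m2")])
print(alignment)
```

The priority queue keeps each step local: accepting a pair only touches the neighbors of that pair, rather than rescoring all possible pairs, which is what makes a greedy scheme of this kind plausible at large scale.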
Taxonomy
Topics: Semantic Web and Ontologies · Data Quality and Management · Topic Modeling
