General Context-Aware Data Matching and Merging Framework
Slavko Žitnik, Lovro Šubelj, Dejan Lavbič, Olegas Vasilecas, Marko Bajec

TL;DR
This paper presents a comprehensive, domain-independent framework for data matching and merging that incorporates multiple context dimensions, semantic enrichment, and trust metrics, validated across diverse datasets.
Contribution
The paper reviews a general framework for data matching and merging that considers multiple context types, evaluates it on five public datasets, and introduces new metrics for framework management.
Findings
Framework achieves improved results across five diverse datasets
Introduction of new attribute, relationship, semantic, and trust metrics
Framework demonstrates domain independence and robustness
Abstract
Due to numerous public information sources and services, many methods for combining heterogeneous data have been proposed recently. However, general end-to-end solutions are still rare, especially systems that take different context dimensions into account. Therefore, the techniques often prove insufficient or are limited to a certain domain. In this paper, we briefly review and rigorously evaluate a general framework for data matching and merging. The framework employs collective entity resolution and redundancy elimination using three dimensions of context types. To achieve domain-independent results, data is enriched with semantics and trust. However, the main contribution of the paper is the evaluation on five public domain-incompatible datasets. Furthermore, we introduce additional attribute, relationship, semantic and trust metrics, which allow complete framework management. Besides…
Taxonomy
Topics: Data Quality and Management · Access Control and Trust · Service-Oriented Architecture and Web Services
