PoWareMatch: a Quality-aware Deep Learning Approach to Improve Human Schema Matching

Roee Shraga; Avigdor Gal

arXiv:2109.07321 · cs.DB · September 16, 2021

Open Access · 1 repository

TL;DR

PoWareMatch is a deep learning-based system that improves human schema matching by calibrating and filtering human matching decisions, producing higher-quality matches for data integration than existing algorithms.

Contribution

This work introduces PoWareMatch, a novel approach combining human judgment and deep learning to improve schema matching quality and reliability.

Findings

1. PoWareMatch accurately predicts the benefit of adding correspondences.
2. It generates higher quality matches than existing algorithms.
3. Empirical validation with over 200 human matchers supports its effectiveness.

Abstract

Schema matching is a core task of any data integration process. Although it has been investigated for many years in the fields of databases, AI, the Semantic Web, and data mining, the main challenge remains the ability to generate quality matches among data concepts (e.g., database attributes). In this work, we examine a novel angle on the behavior of humans as matchers, studying match creation as a process. We analyze the dynamics of common evaluation measures (precision, recall, and f-measure) with respect to this angle and highlight the need for unbiased matching to support this analysis. Unbiased matching, a newly defined concept that describes the common assumption that human decisions represent reliable assessments of schemata correspondences, is, however, not an inherent property of human matchers. In what follows, we design PoWareMatch, which makes use of a deep learning mechanism to calibrate and…
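
To make the idea of "match creation as a process" concrete, the sketch below tracks how precision, recall, and f-measure evolve as a matcher adds correspondences one at a time. It is only an illustration of the standard definitions of these measures, not the authors' implementation; the function name `incremental_scores`, the `Correspondence` type, and the toy attribute names are all assumptions made for this example.

```python
from typing import Iterable, List, Set, Tuple

# A correspondence pairs an attribute of one schema with an attribute of another.
# (Hypothetical representation chosen for this sketch.)
Correspondence = Tuple[str, str]


def incremental_scores(
    decisions: Iterable[Correspondence],
    reference: Set[Correspondence],
) -> List[Tuple[float, float, float]]:
    """Return (precision, recall, f1) after each newly added correspondence."""
    matched: Set[Correspondence] = set()
    correct = 0
    history: List[Tuple[float, float, float]] = []
    for pair in decisions:
        if pair in matched:
            continue  # a repeated decision does not change the match
        matched.add(pair)
        if pair in reference:
            correct += 1
        precision = correct / len(matched)
        recall = correct / len(reference) if reference else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        history.append((precision, recall, f1))
    return history


# Toy reference match and an ordered sequence of human decisions (made-up data).
reference = {("a", "x"), ("b", "y"), ("c", "z")}
decisions = [("a", "x"), ("a", "y"), ("b", "y")]

for step, (p, r, f) in enumerate(incremental_scores(decisions, reference), 1):
    print(f"step {step}: precision={p:.2f} recall={r:.2f} f1={f:.2f}")
```

In this toy run the second decision is incorrect, so precision and f-measure drop while recall stays flat; the third, correct decision raises all three. This shows why the order and reliability of individual decisions matter when a match is evaluated as it is being built.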

Code & Models

Repositories

shraga89/powarematch (PyTorch, official)

Taxonomy

Topics: Data Quality and Management · Data Management and Algorithms