Learning to Characterize Matching Experts
Roee Shraga, Ofra Amir, Avigdor Gal

TL;DR
This paper introduces a framework to identify and characterize reliable human matching experts in data integration, enhancing accuracy by filtering out less trustworthy matchers using novel features and empirical validation.
Contribution
The paper presents a novel framework and features for characterizing human matching experts, improving data matching accuracy by filtering inexpert matchers.
Findings
The approach effectively identifies reliable human matchers.
Filtering inexpert matchers improves overall matching results.
Empirical evaluation demonstrates the framework's practical usefulness.
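The filtering idea in the findings above can be illustrated with a minimal sketch. It assumes a reference match is available so that each human matcher's precision and recall can be computed; the names `precision_recall`, `filter_matchers`, and the `min_precision` threshold are hypothetical and chosen only for illustration. The paper's own features and learning setup are not described in this summary, so this is not the authors' method, only a simple instance of "keep the matchers whose correspondences can mostly be trusted".

```python
from typing import Dict, Set, Tuple

# A correspondence pairs an element of one schema with an element of another.
Correspondence = Tuple[str, str]


def precision_recall(
    proposed: Set[Correspondence], reference: Set[Correspondence]
) -> Tuple[float, float]:
    """Return (precision, recall) of a matcher's proposed correspondences."""
    if not proposed or not reference:
        return 0.0, 0.0
    correct = len(proposed & reference)
    return correct / len(proposed), correct / len(reference)


def filter_matchers(
    matchers: Dict[str, Set[Correspondence]],
    reference: Set[Correspondence],
    min_precision: float = 0.8,  # hypothetical threshold, not from the paper
) -> Dict[str, Set[Correspondence]]:
    """Keep only matchers whose precision reaches the threshold."""
    kept = {}
    for name, proposed in matchers.items():
        precision, _ = precision_recall(proposed, reference)
        if precision >= min_precision:
            kept[name] = proposed
    return kept


# Toy data: two human matchers and a reference match (all invented).
reference = {("a", "x"), ("b", "y"), ("c", "z")}
matchers = {
    "m1": {("a", "x"), ("b", "y")},
    "m2": {("a", "y"), ("b", "y"), ("c", "x")},
}
trusted = filter_matchers(matchers, reference)
```

In practice a reference match is not available when the matchers are working, which is why a learned characterization of expertise is needed; the sketch only shows what filtering does once some estimate of reliability exists.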
Abstract
Matching is a task at the heart of any data integration process, aimed at identifying correspondences among data elements. Matching problems were traditionally solved in a semi-automatic manner, with correspondences being generated by matching algorithms and outcomes subsequently validated by human experts. Human-in-the-loop data integration has recently been challenged by the introduction of big data, and recent studies have analyzed obstacles to effective human matching and validation. In this work we characterize human matching experts, those humans whose proposed correspondences can mostly be trusted to be valid. We provide a novel framework for characterizing matching experts that, accompanied by a novel set of features, can be used to identify reliable and valuable human experts. We demonstrate the usefulness of our approach using an extensive empirical evaluation. In particular,…
Taxonomy
Topics: Mobile Crowdsensing and Crowdsourcing · Anomaly Detection Techniques and Applications · Privacy-Preserving Technologies in Data
