TL;DR
This paper proposes a threshold-independent score calibration method for entity matching that reduces bias and maintains accuracy, addressing fairness issues overlooked by traditional threshold-based approaches.
Contribution
It introduces a novel calibration technique using Wasserstein barycenters to mitigate bias in entity matching scores without relying on thresholds.
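To make the idea concrete, here is a minimal sketch of one-dimensional barycenter calibration. It is not the paper's implementation. It relies on a standard fact: for distributions on the real line, the quantile function of the Wasserstein-2 barycenter is the weighted average of the groups' quantile functions. The function name `barycenter_calibrate`, the group-size weighting, the quantile grid size, and the synthetic Beta-distributed scores are all assumptions made for illustration.

```python
import numpy as np

def barycenter_calibrate(scores, groups, n_quantiles=101):
    """Map each group's scores onto the 1D Wasserstein-2 barycenter.

    Hypothetical helper: weights are group proportions, and empirical
    quantiles on a fixed grid approximate each group's distribution.
    """
    scores = np.asarray(scores, dtype=float)
    groups = np.asarray(groups)
    qs = np.linspace(0.0, 1.0, n_quantiles)

    labels, counts = np.unique(groups, return_counts=True)
    weights = counts / counts.sum()

    # Empirical quantile function of each group on a shared grid.
    group_q = {g: np.quantile(scores[groups == g], qs) for g in labels}

    # In 1D, the barycenter's quantile function is the weighted
    # average of the per-group quantile functions.
    bary_q = sum(w * group_q[g] for g, w in zip(labels, weights))

    calibrated = np.empty_like(scores)
    for g in labels:
        mask = groups == g
        # Approximate within-group CDF value of each score ...
        ranks = np.interp(scores[mask], group_q[g], qs)
        # ... then read off the barycenter quantile at that rank.
        calibrated[mask] = np.interp(ranks, qs, bary_q)
    return calibrated

# Synthetic matching scores for two hypothetical groups.
rng = np.random.default_rng(0)
scores = np.concatenate([rng.beta(2, 5, 500), rng.beta(5, 2, 500)])
groups = np.array(["a"] * 500 + ["b"] * 500)
calibrated = barycenter_calibrate(scores, groups)
```

Because every group is mapped onto the same target distribution, the calibrated score distributions roughly coincide whichever threshold is applied later, and the ranking of pairs within each group is preserved.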
Findings
Biases in matching scores can be effectively reduced.
Calibration preserves accuracy across datasets.
Threshold-independent fairness improves data cleaning processes.
Abstract
Entity Matching (EM) is a critical task in numerous fields, such as healthcare, finance, and public administration, as it identifies records that refer to the same entity within or across different databases. EM faces considerable challenges, particularly with false positives and negatives. These are typically addressed by generating matching scores and applying thresholds to balance false positives and negatives in various contexts. However, adjusting these thresholds can affect the fairness of the outcomes, a critical factor that remains largely overlooked in current fair EM research. The existing body of research on fair EM tends to concentrate on static thresholds, neglecting their critical impact on fairness. To address this, we introduce a new approach in EM using recent metrics for evaluating biases in score-based binary classification, particularly through the lens of…
