GraLMatch: Matching Groups of Entities with Graphs and Language Models
Fernando De Meer Pardo, Claude Lehmann, Dennis Gehrig, Andrea Nagy, Stefano Nicoli, Branka Hadji Misheva, Martin Braschler, Kurt Stockinger

TL;DR
This paper introduces GraLMatch, a graph-based method for entity group matching across multiple data sources, emphasizing the importance of transitive information and fine-tuning language models for improved accuracy.
Contribution
It presents a novel approach combining graph properties and Transformer fine-tuning to enhance multi-source entity group matching, along with new benchmark datasets.
Findings
Considering transitive matches improves group accuracy.
Graph-based filtering reduces false positives.
Fine-tuning DistilBERT enhances matching precision.
Abstract
In this paper, we present an end-to-end multi-source Entity Matching problem, which we call entity group matching, where the goal is to assign records that originate from multiple data sources but represent the same real-world entity to the same group. We focus on the effects of transitively matched records, i.e., the records connected by paths in the graph G = (V,E) whose nodes represent the records and whose edges represent pairwise matches between them. We present a real-world instance of this problem, where the challenge is to match records of companies and financial securities originating from different data providers. We also introduce two new multi-source benchmark datasets that present matching challenges similar to those of real-world records. A distinctive characteristic of these records is that they are regularly updated following real-world events, but updates are not applied uniformly across…
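The notion of transitively matched records can be illustrated with a small sketch. It is not the authors' implementation: it only shows how pairwise match decisions induce groups as the connected components of G = (V,E), using a plain union-find. The function name `group_records` and the record identifiers are hypothetical and chosen for illustration.

```python
from collections import defaultdict


def group_records(records, matched_pairs):
    """Group records into connected components of the match graph.

    records: iterable of hashable record ids (the nodes V).
    matched_pairs: iterable of (a, b) pairs predicted as matches (the edges E).
    Illustrative helper only; not taken from the paper.
    """
    parent = {r: r for r in records}

    def find(x):
        # Follow parent pointers to the root, halving the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in matched_pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a  # merge the two components

    groups = defaultdict(set)
    for r in records:
        groups[find(r)].add(r)
    return sorted(sorted(g) for g in groups.values())


# Hypothetical records from three sources (a, b, c).
records = ["a1", "b1", "c1", "a2", "b2"]
pairs = [("a1", "b1"), ("b1", "c1"), ("a2", "b2")]
groups = group_records(records, pairs)

# A single false-positive edge merges two otherwise separate groups.
merged = group_records(records, pairs + [("c1", "a2")])
```

Here `a1` and `c1` end up in the same group although they were never directly compared, which is the transitive effect the abstract describes. The `merged` call shows the downside: one wrong pairwise prediction joins two groups, which is why filtering edges in the graph matters for group-level accuracy.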
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Advanced Graph Neural Networks
