OpenSanctions Pairs: Large-Scale Entity Matching with LLMs
Chandler Smith, Magnus Sesodia, Friedrich Lindenberg, Christian Schroeder de Witt

TL;DR
This paper introduces OpenSanctions Pairs, a large-scale benchmark for entity matching, demonstrating that large language models significantly outperform traditional rule-based systems in complex, multilingual, real-world data matching tasks.
Contribution
The paper provides a large new dataset for entity matching and benchmarks LLMs against a production rule-based matcher, showing that LLMs perform better in this domain.
Findings
LLMs achieve up to 98.95% F1 (GPT-4o), outperforming the production rule-based baseline (91.33% F1).
Adding in-context examples offers minimal gains and can reduce performance.
Error analysis reveals different failure modes for rule-based and LLM methods.
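The F1 figures above combine precision and recall over labeled pairs. As a minimal sketch of how such a score is computed for binary match/no-match decisions (the function name `f1_score` and the toy labels are illustrative assumptions, not taken from the paper's evaluation code):

```python
from typing import Sequence


def f1_score(labels: Sequence[bool], predictions: Sequence[bool]) -> float:
    """Compute F1 for the positive ("match") class over paired decisions."""
    if len(labels) != len(predictions):
        raise ValueError("labels and predictions must have the same length")

    true_pos = false_pos = false_neg = 0
    for label, pred in zip(labels, predictions):
        if pred and label:
            true_pos += 1
        elif pred and not label:
            false_pos += 1
        elif label and not pred:
            false_neg += 1

    # Guard against empty denominators when nothing is predicted or labeled positive.
    precision = true_pos / (true_pos + false_pos) if (true_pos + false_pos) else 0.0
    recall = true_pos / (true_pos + false_neg) if (true_pos + false_neg) else 0.0
    if precision + recall == 0.0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Toy data (invented for illustration): 2 true positives, 1 false positive, 1 false negative.
toy_labels = [True, True, False, False, True]
toy_predictions = [True, False, True, False, True]
toy_f1 = f1_score(toy_labels, toy_predictions)
```

Because F1 is the harmonic mean of precision and recall, a matcher cannot score well by only being conservative or only being permissive.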
Abstract
We release OpenSanctions Pairs, a large-scale entity matching benchmark derived from real-world international sanctions aggregation and analyst deduplication. The dataset contains 755,540 labeled pairs spanning 293 heterogeneous sources across 31 countries, with multilingual and cross-script names, noisy and missing attributes, and set-valued fields typical of compliance workflows. We benchmark a production rule-based matcher (nomenklatura RegressionV1 algorithm) against open- and closed-source LLMs in zero- and few-shot settings. Off-the-shelf LLMs substantially outperform the production rule-based baseline (91.33% F1), reaching up to 98.95% F1 (GPT-4o) and 98.23% F1 with a locally deployable open model (DeepSeek-R1-Distill-Qwen-14B). DSPy MIPROv2 prompt optimization yields consistent but modest gains, while adding in-context examples provides little additional benefit and can…
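To make the zero-shot setting concrete, the following is a rough sketch of how a pair of records with set-valued and missing fields might be rendered into a single prompt. Everything here is an assumption for illustration: the field names, the prompt wording, the helper names `render_record` and `build_zero_shot_prompt`, and the example records are invented and do not reproduce the paper's prompts or data.

```python
from typing import Dict, List

# Hypothetical record shape: each field maps to a list of values (set-valued),
# and an empty list stands for a missing attribute.
Record = Dict[str, List[str]]


def render_record(record: Record) -> str:
    """Render one record as 'field: v1; v2' lines, marking empty fields."""
    lines = []
    for field in sorted(record):
        values = record[field]
        rendered = "; ".join(values) if values else "(missing)"
        lines.append(f"{field}: {rendered}")
    return "\n".join(lines)


def build_zero_shot_prompt(left: Record, right: Record) -> str:
    """Build a zero-shot match/no-match prompt (wording is illustrative only)."""
    return (
        "Do the two records below refer to the same real-world entity?\n"
        "Answer with exactly one word: MATCH or NO_MATCH.\n\n"
        f"Record A:\n{render_record(left)}\n\n"
        f"Record B:\n{render_record(right)}\n"
    )


# Invented example records with a cross-script name variant and a missing field.
record_a: Record = {
    "name": ["Ivan Petrov", "Иван Петров"],
    "birthDate": ["1970-01-01"],
    "country": [],
}
record_b: Record = {
    "name": ["Petrov, Ivan"],
    "birthDate": ["1970"],
    "country": ["ru"],
}
prompt = build_zero_shot_prompt(record_a, record_b)
```

A few-shot variant would prepend a handful of labeled pairs in the same format before the query pair.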
Taxonomy
Topics: Data Quality and Management · Topic Modeling · Authorship Attribution and Profiling
