OpenSanctions Pairs: Large-Scale Entity Matching with LLMs

Chandler Smith; Magnus Sesodia; Friedrich Lindenberg; Christian Schroeder de Witt

arXiv:2603.11051 · cs.IR · March 13, 2026

TL;DR

This paper introduces OpenSanctions Pairs, a large-scale benchmark for entity matching, demonstrating that large language models significantly outperform traditional rule-based systems in complex, multilingual, real-world data matching tasks.

Contribution

The paper provides a large new dataset for entity matching (755,540 labeled pairs) and benchmarks LLMs against a production rule-based matcher, showing that the LLMs perform better in this domain.

Findings

01

LLMs reach up to 98.95% F1 (GPT-4o), outperforming the production rule-based baseline (91.33% F1).

02

Adding in-context examples offers minimal gains and can reduce performance.

03

Error analysis reveals different failure modes for rule-based and LLM methods.
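
The findings above are reported as F1 scores over labeled pairs. As a reminder of what that metric measures, here is a minimal sketch of precision, recall, and F1 for binary match decisions. The function name `pairwise_f1` and the toy labels are illustrative assumptions, not taken from the paper or its evaluation code.

```python
from typing import Sequence, Tuple


def pairwise_f1(labels: Sequence[bool], predictions: Sequence[bool]) -> Tuple[float, float, float]:
    """Return (precision, recall, F1) for binary match/no-match decisions.

    labels[i] is True when pair i is a true match; predictions[i] is the
    matcher's decision for the same pair. Hypothetical helper for illustration.
    """
    if len(labels) != len(predictions):
        raise ValueError("labels and predictions must have the same length")

    tp = fp = fn = 0
    for truth, guess in zip(labels, predictions):
        if guess and truth:
            tp += 1  # correctly predicted match
        elif guess and not truth:
            fp += 1  # predicted match, actually distinct entities
        elif truth and not guess:
            fn += 1  # missed a true match

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    return precision, recall, f1


if __name__ == "__main__":
    # Toy data only: 2 true positives, 1 false positive, 1 false negative.
    toy_labels = [True, True, False, False, True]
    toy_predictions = [True, False, True, False, True]
    print(pairwise_f1(toy_labels, toy_predictions))
```

Because F1 is the harmonic mean of precision and recall, a matcher cannot score well by over-predicting matches or by being overly conservative.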

Abstract

We release OpenSanctions Pairs, a large-scale entity matching benchmark derived from real-world international sanctions aggregation and analyst deduplication. The dataset contains 755,540 labeled pairs spanning 293 heterogeneous sources across 31 countries, with multilingual and cross-script names, noisy and missing attributes, and set-valued fields typical of compliance workflows. We benchmark a production rule-based matcher (nomenklatura RegressionV1 algorithm) against open- and closed-source LLMs in zero- and few-shot settings. Off-the-shelf LLMs substantially outperform the production rule-based baseline (91.33% F1), reaching up to 98.95% F1 (GPT-4o) and 98.23% F1 with a locally deployable open model (DeepSeek-R1-Distill-Qwen-14B). DSPy MIPROv2 prompt optimization yields consistent but modest gains, while adding in-context examples provides little additional benefit and can…
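
To make the zero-shot setting concrete, the sketch below shows one way a pair of records with set-valued fields might be serialized into a single prompt for an LLM. This is an assumption-laden illustration: the prompt wording, the function names `render_entity` and `build_zero_shot_prompt`, and the example field names and values are all hypothetical and are not the prompt or schema used in the paper.

```python
from typing import Dict, Set

# Hypothetical record shape: each property maps to a set of string values,
# reflecting the set-valued fields mentioned in the abstract.
Entity = Dict[str, Set[str]]


def render_entity(entity: Entity) -> str:
    """Serialize an entity deterministically, skipping empty (missing) fields."""
    lines = []
    for key in sorted(entity):
        values = entity[key]
        if not values:
            continue  # missing attribute: omit rather than print an empty line
        lines.append(f"{key}: {'; '.join(sorted(values))}")
    return "\n".join(lines) if lines else "(no attributes)"


def build_zero_shot_prompt(left: Entity, right: Entity) -> str:
    """Build a zero-shot prompt (no in-context examples). Illustrative wording only."""
    return (
        "Decide whether the two records below refer to the same real-world entity.\n"
        "Answer with exactly one word: MATCH or NO_MATCH.\n\n"
        f"Entity A:\n{render_entity(left)}\n\n"
        f"Entity B:\n{render_entity(right)}\n"
    )


if __name__ == "__main__":
    # Made-up example records, not drawn from the dataset.
    a: Entity = {"name": {"Ivan Petrov", "Иван Петров"}, "birthDate": {"1970-01-01"}}
    b: Entity = {"name": {"Petrov, Ivan"}, "birthDate": set(), "country": {"ru"}}
    print(build_zero_shot_prompt(a, b))
```

Sorting keys and values keeps the serialization stable across runs, which makes model outputs easier to compare; a few-shot variant would prepend labeled example pairs in the same format.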

Taxonomy

Topics: Data Quality and Management · Topic Modeling · Authorship Attribution and Profiling