Structured Multi-Step Reasoning for Entity Matching Using Large Language Model

Rohan Bopardikar; Jin Wang; Jia Zou

arXiv:2511.22832 · cs.DB · December 1, 2025

Open Access

TL;DR

This paper introduces a structured multi-step reasoning framework for entity matching with large language models. It decomposes the matching decision into explicit stages, with the aim of improving accuracy and robustness in data integration.

Contribution

It proposes a three-step reasoning process and a debate-based strategy for LLMs, both intended to improve entity matching performance over existing single-step prompting methods.

Findings

01 Structured reasoning improves matching accuracy on several datasets.

02 The debate strategy makes matching decisions more robust.

03 The study highlights challenges and future opportunities for reasoning-guided LLMs.

Abstract

Entity matching is a fundamental task in data cleaning and data integration. With the rapid adoption of large language models (LLMs), recent studies have explored zero-shot and few-shot prompting to improve entity matching accuracy. However, most existing approaches rely on single-step prompting and offer limited investigation into structured reasoning strategies. In this work, we investigate how to enhance LLM-based entity matching by decomposing the matching process into multiple explicit reasoning stages. We propose a three-step framework that first identifies matched and unmatched tokens between two records, then determines the attributes most influential to the matching decision, and finally predicts whether the records refer to the same real-world entity. In addition, we explore a debate-based strategy that contrasts supporting and opposing arguments to improve decision…
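The three-step framework described in the abstract could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the prompt wording, the `llm` callable (any function that maps a prompt string to a text reply), and the helper names `serialize`, `three_step_match`, and `stub_llm` are all hypothetical.

```python
from typing import Callable, Dict

Record = Dict[str, str]
LLM = Callable[[str], str]  # hypothetical interface: prompt in, text out


def serialize(record: Record) -> str:
    """Flatten a record into 'attribute: value' pairs for prompting."""
    return "; ".join(f"{key}: {value}" for key, value in record.items())


def three_step_match(a: Record, b: Record, llm: LLM) -> Dict[str, object]:
    """Run three chained prompts; the wording is assumed, not from the paper."""
    pair = f"Record A: {serialize(a)}\nRecord B: {serialize(b)}"

    # Step 1: identify matched and unmatched tokens between the two records.
    tokens = llm(
        f"{pair}\nList the tokens that match and the tokens that do not "
        "match between the two records."
    )

    # Step 2: determine the attributes most influential to the decision.
    attributes = llm(
        f"{pair}\nToken analysis: {tokens}\n"
        "Which attributes matter most for deciding whether these records match?"
    )

    # Step 3: predict whether both records refer to the same entity.
    verdict = llm(
        f"{pair}\nToken analysis: {tokens}\nKey attributes: {attributes}\n"
        "Do the records refer to the same real-world entity? Answer yes or no."
    )

    return {
        "tokens": tokens,
        "attributes": attributes,
        "match": verdict.strip().lower().startswith("yes"),
    }


def stub_llm(prompt: str) -> str:
    """Stand-in for a real model so the sketch runs offline."""
    if "Answer yes or no" in prompt:
        return "Yes"
    return "(model analysis would appear here)"


result = three_step_match(
    {"title": "iPhone 13", "brand": "Apple"},
    {"title": "Apple iPhone 13 128GB", "brand": "Apple"},
    stub_llm,
)
print(result["match"])
```

Each step feeds its output into the next prompt, which is the main difference from single-step prompting; replacing `stub_llm` with a call to an actual model client is left to the reader.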

Taxonomy

Topics: Data Quality and Management · Topic Modeling · Advanced Graph Neural Networks