Learning from Natural Language Explanations for Generalizable Entity Matching
Somin Wadhwa, Adit Krishnan, Runhui Wang, Byron C. Wallace, Chris Kong

TL;DR
This paper proposes a novel approach to entity matching by using natural language explanations to distill large language model reasoning into smaller, more efficient models, improving out-of-domain generalization.
Contribution
It introduces a conditional generation framework for entity matching that leverages natural language explanations to enhance model robustness and generalization, especially in out-of-domain scenarios.
Findings
Achieves a 10.85% F1 improvement on out-of-domain tests.
Explanations significantly improve model performance and robustness.
Distilling LLM reasoning enables efficient, scalable entity matching.
Abstract
Entity matching is the task of linking records from different sources that refer to the same real-world entity. Past work has primarily treated entity linking as a standard supervised learning problem. However, supervised entity matching models often do not generalize well to new data, and collecting exhaustive labeled training data is often cost-prohibitive. Further, recent efforts have adopted LLMs for this task in few- and zero-shot settings, exploiting their general knowledge. But LLMs are prohibitively expensive for performing inference at scale for real-world entity matching tasks. As an efficient alternative, we re-cast entity matching as a conditional generation task as opposed to binary classification. This enables us to "distill" LLM reasoning into smaller entity matching models via natural language explanations. This approach achieves strong performance, especially on…
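To make the conditional-generation framing concrete, here is a minimal sketch of how a record pair might be turned into a text-to-text training example whose target carries both the label and an LLM-written explanation. The prompt wording, the `match` / `no match` label strings, the record serialization, and all names (`Seq2SeqExample`, `serialize_record`, `build_example`, `parse_label`) are assumptions made for illustration; the abstract does not specify the authors' actual format.

```python
from dataclasses import dataclass


@dataclass
class Seq2SeqExample:
    source: str  # text given to the smaller model as input
    target: str  # text the smaller model is trained to generate


def serialize_record(record: dict) -> str:
    # Flatten attribute/value pairs into one string (hypothetical format).
    return "; ".join(f"{key}: {value}" for key, value in record.items())


def build_example(left: dict, right: dict, is_match: bool, explanation: str) -> Seq2SeqExample:
    # Instead of predicting a binary class, the model generates the label
    # followed by a natural language explanation (assumed target layout).
    source = (
        "Do these records refer to the same entity? "
        f"Record A: {serialize_record(left)} "
        f"Record B: {serialize_record(right)}"
    )
    label = "match" if is_match else "no match"
    target = f"{label}. Explanation: {explanation}"
    return Seq2SeqExample(source=source, target=target)


def parse_label(generated: str) -> bool:
    # Recover the binary decision from the start of the generated text.
    return generated.strip().lower().startswith("match")
```

In a setup like this, the explanation string would come from a larger LLM at training time, while only the smaller generator is run at inference, and `parse_label` reads the decision off the front of its output.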
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Data Quality and Management
