Liberal Entity Matching as a Compound AI Toolchain
Silvery D. Fu, David Wang, Wen Zhang, Kathleen Ge

TL;DR
Libem is a modular, tool-oriented AI system for entity matching that dynamically adapts and refines its process, overcoming limitations of static knowledge and rigidity in traditional LLM-based methods.
Contribution
The paper introduces Libem, a flexible, composable AI toolchain for entity matching that supports dynamic tool use, self-refinement, and optimization, enhancing adaptability and modularity.
Findings
Libem demonstrates improved accuracy over static LLM approaches.
The system enables iterative refinement and customization for different datasets.
Libem's modular design facilitates system improvements and integration.
Abstract
Entity matching (EM), the task of identifying whether two descriptions refer to the same entity, is essential in data management. Traditional methods have evolved from rule-based to AI-driven approaches, yet current techniques using large language models (LLMs) often fall short due to their reliance on static knowledge and rigid, predefined prompts. In this paper, we introduce Libem, a compound AI system designed to address these limitations by incorporating a flexible, tool-oriented approach. Libem supports entity matching through dynamic tool use, self-refinement, and optimization, allowing it to adapt and refine its process based on the dataset and performance metrics. Unlike traditional solo-AI EM systems, which often suffer from a lack of modularity that hinders iterative design improvements and system optimization, Libem offers a composable and reusable toolchain. This approach…
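The abstract's idea of a composable toolchain with self-refinement can be illustrated with a minimal sketch. This is not Libem's API: every name here (`Toolchain`, `refine`, `lowercase`, `strip_punctuation`, `LABELED`) is hypothetical, the token-overlap score stands in for the LLM matching call, and the greedy loop is only one possible reading of "refine its process based on the dataset and performance metrics".

```python
import re
from dataclasses import dataclass, field
from typing import Callable, List, Tuple

# A "tool" here is any function that rewrites a pair of descriptions (assumption).
Tool = Callable[[str, str], Tuple[str, str]]


def lowercase(left: str, right: str) -> Tuple[str, str]:
    return left.lower(), right.lower()


def strip_punctuation(left: str, right: str) -> Tuple[str, str]:
    clean = lambda s: re.sub(r"[^\w\s]", " ", s)
    return clean(left), clean(right)


def jaccard(a: str, b: str) -> float:
    """Token-overlap score; a stand-in for the LLM matching call."""
    sa, sb = set(a.split()), set(b.split())
    if not sa and not sb:
        return 1.0
    return len(sa & sb) / len(sa | sb)


@dataclass
class Toolchain:
    tools: List[Tool] = field(default_factory=list)
    threshold: float = 0.5

    def match(self, left: str, right: str) -> bool:
        # Run every tool in order, then score the prepared pair.
        for tool in self.tools:
            left, right = tool(left, right)
        return jaccard(left, right) >= self.threshold


def accuracy(chain: Toolchain, labeled) -> float:
    correct = sum(chain.match(l, r) == y for l, r, y in labeled)
    return correct / len(labeled)


def refine(chain: Toolchain, candidates: List[Tool], labeled):
    """Greedy refinement: keep a candidate tool only if accuracy improves."""
    best = accuracy(chain, labeled)
    for tool in candidates:
        trial = Toolchain(chain.tools + [tool], chain.threshold)
        score = accuracy(trial, labeled)
        if score > best:
            chain, best = trial, score
    return chain, best


# Tiny made-up labeled set: (left description, right description, is_match).
LABELED = [
    ("Apple iPhone 13", "apple iphone 13", True),
    ("Dell XPS-13", "Dell XPS 13", True),
    ("Sony WH-1000XM4", "Bose QC45", False),
]

refined, score = refine(Toolchain(), [lowercase, strip_punctuation], LABELED)
```

Because each tool is an ordinary function, tools can be added, removed, or reordered without touching the matcher, which is the modularity the abstract contrasts with solo-AI EM systems.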
Taxonomy
Topics: Data Quality and Management · Topic Modeling · Natural Language Processing Techniques
