ASPEN: ASP-Based System for Collective Entity Resolution
Zhiliang Xiang, Meghyn Bienvenu, Gianluca Cima, Víctor Gutiérrez-Basulto, Yazmín Ibáñez-García

TL;DR
ASPEN is an answer set programming (ASP) implementation of a declarative framework for collective entity resolution. It addresses practical issues left open by an earlier ASP encoding, most notably the efficient computation of similarity facts, and achieves high accuracy on real-world datasets.
Contribution
The paper develops new variants of the ASP encodings, including Datalog approximations, and shows how to use different functionalities of ASP solvers to compute maximal solutions and (approximations of) the sets of possible and certain merges.
Findings
High accuracy achieved on real-world datasets
Different encoding variants impact performance and accuracy
Recursion and approximation methods influence solution quality
Abstract
In this paper, we present ASPEN, an answer set programming (ASP) implementation of a recently proposed declarative framework for collective entity resolution (ER). While an ASP encoding had been previously suggested, several practical issues had been neglected, most notably, the question of how to efficiently compute the (externally defined) similarity facts that are used in rule bodies. This leads us to propose new variants of the encodings (including Datalog approximations) and show how to employ different functionalities of ASP solvers to compute (maximal) solutions, and (approximations of) the sets of possible and certain merges. A comprehensive experimental evaluation of ASPEN on real-world datasets shows that the approach is promising, achieving high accuracy in real-life ER scenarios. Our experiments also yield useful insights into the relative merits of different types of…
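The two ideas the abstract singles out, precomputed similarity facts and possible versus certain merges, can be illustrated with a minimal Python sketch. It is not ASPEN's implementation. The predicate name `sim`, the 0.8 threshold, and the use of `difflib.SequenceMatcher` as the similarity measure are all assumptions made for illustration. The sketch also assumes the usual reading that a possible merge appears in at least one solution and a certain merge appears in every solution.

```python
from difflib import SequenceMatcher
from itertools import combinations


def similarity_facts(values, threshold=0.8):
    """Return ASP-style facts sim("a","b",score) for sufficiently similar pairs.

    Hypothetical helper: the predicate name, threshold, and similarity
    measure are illustrative choices, not taken from the paper.
    Values are assumed to contain no double quotes.
    """
    facts = []
    for a, b in combinations(sorted(set(values)), 2):
        score = SequenceMatcher(None, a.lower(), b.lower()).ratio()
        if score >= threshold:
            # Scale to an integer percentage, since ASP terms are integers.
            facts.append(f'sim("{a}","{b}",{round(score * 100)}).')
    return facts


def possible_and_certain(solutions):
    """Split merges into possible (in some solution) and certain (in all).

    Each solution is a collection of merge pairs; this reading of
    'possible' and 'certain' is an assumption stated in the lead-in.
    """
    solutions = [frozenset(s) for s in solutions]
    possible = frozenset().union(*solutions)
    certain = frozenset.intersection(*solutions) if solutions else frozenset()
    return possible, certain
```

Computing the facts once, outside the solver, means rule bodies can refer to them like any other input fact. The second function only aggregates solutions that have already been enumerated; enumerating them is the solver's job and is not shown here.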
Taxonomy
Topics: Data Quality and Management · Network Security and Intrusion Detection · Advanced Database Systems and Queries
