On the problem of entity matching and its application in automated settlement of receivables
Lukasz Czekaj, Tomasz Biegus, Robert Kitlowski, Stanislaw Raczynski, Mateusz Olszewski, Jakub Dziedzic, Paweł Tomasik, Ryszard Kozera, Alexander Prokopenya, Robert Olszewski

TL;DR
This paper presents novel entity matching techniques for automated settlement of receivables in non-governmental organizations, raising recall from 78% to over 90% at 99% precision on real-world operational data.
Contribution
It introduces three methods (score post-processing, a cascade model, and a chain model) that improve the matching quality of a base ranking algorithm in open-world scenarios.
Findings
Recall improved from 78% to over 90% at 99% precision.
Methods are validated on real-world operational data.
Contributes to automated receivable settlement and multilabel classification.
Abstract
This paper covers automated settlement of receivables in non-governmental organizations. We tackle the problem with entity matching techniques. We consider a setup in which a base algorithm produces a preliminary ranking of matches, and we then apply several novel methods to increase the matching quality of the base algorithm: score post-processing, a cascade model, and a chain model. The methods presented here contribute to automated settlement of receivables, entity matching, and multilabel classification in an open-world scenario. We evaluate our approach on real-world operational data from a company that provides settlement of receivables as a service: the proposed methods boost recall from 78% (base model) to over 90% at 99% precision.
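The headline metric, recall at a fixed precision, can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes that each incoming record comes with the score of its top-ranked candidate and a flag saying whether that candidate is the true match, and that a match is accepted only when the score reaches a threshold (otherwise the system abstains, which is what makes the setting open-world). The function name `recall_at_precision`, the record layout, and the toy data are all hypothetical.

```python
from typing import List, Optional, Tuple

# (score of the top-ranked candidate, True if that candidate is the true match)
Scored = Tuple[float, bool]


def recall_at_precision(
    top: List[Scored], n_matchable: int, target_precision: float
) -> Tuple[Optional[float], float]:
    """Find the acceptance threshold with the highest recall whose
    precision is at least target_precision.

    precision = correct accepted matches / all accepted matches
    recall    = correct accepted matches / records that have a true match
    Returns (threshold, recall); threshold is None if no threshold qualifies.
    """
    ordered = sorted(top, key=lambda r: r[0], reverse=True)
    best_threshold: Optional[float] = None
    best_recall = 0.0
    accepted = 0
    correct = 0
    for i, (score, is_correct) in enumerate(ordered):
        accepted += 1
        correct += int(is_correct)
        # Records with equal scores are accepted or rejected together,
        # so evaluate only after the last record of each tie group.
        if i + 1 < len(ordered) and ordered[i + 1][0] == score:
            continue
        precision = correct / accepted
        recall = correct / n_matchable
        if precision >= target_precision and recall > best_recall:
            best_threshold, best_recall = score, recall
    return best_threshold, best_recall


# Toy data (made up): six records, five of which have a true match.
toy = [(0.99, True), (0.95, True), (0.90, True),
       (0.80, False), (0.70, True), (0.40, False)]
threshold, recall = recall_at_precision(toy, n_matchable=5, target_precision=0.99)
print(threshold, recall)
```

Under this reading, a method improves on the base model when it moves correct matches above the threshold (or wrong ones below it), so that more records can be accepted without precision dropping under the target.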
Taxonomy
Topics: Data Quality and Management · Data Management and Algorithms
Methods: Balanced Selection
