Using Elasticsearch for entity recognition in affiliation disambiguation
Anne L'Hôte, Eric Jeangirard

TL;DR
This paper presents a modular Elasticsearch-based method for automatic affiliation recognition in scholarly metadata. It lets users tune precision and recall when aligning affiliations with registries such as a list of countries, GRID.ac, and RNSR.
Contribution
It introduces a flexible, user-controlled alignment approach using Elasticsearch for affiliation disambiguation in scholarly data.
Findings
Effective alignment on three registries demonstrated
Method allows user-defined precision and recall
Open-source implementation available on GitHub
Abstract
Automatic recognition of affiliations in the metadata of scholarly publications is a key step in monitoring and analyzing trends in scientific production, especially in an open science context. We propose an automatic method for aligning affiliations with registries, based on Elasticsearch. The method is modular and leaves the choice of alignment criteria to the user, who thereby keeps control over its precision and recall. An implementation that aligns affiliations with three registries (countries, GRID.ac and RNSR, the directory of research laboratories in France) is available on GitHub at https://github.com/dataesr/matcher, and its performance is analyzed in this paper.
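To make the idea of user-chosen alignment criteria concrete, here is a minimal sketch in Python. It is not the authors' implementation and does not call Elasticsearch: an in-memory list stands in for the index, and the registry entries, field names (`name`, `city`, `country`), and the functions `normalize`, `contains_phrase`, and `match` are all hypothetical. It assumes that a strategy is a set of fields that must all appear in the affiliation string, that strategies are tried from strictest to loosest, and that a result is accepted only when exactly one registry entry matches.

```python
import re
from typing import Dict, List, Optional, Sequence

# Hypothetical toy registry standing in for an Elasticsearch index.
# Entries and field names are illustrative only (ASCII-only for brevity).
REGISTRY: List[Dict[str, str]] = [
    {"id": "org-1", "name": "sorbonne university", "city": "paris", "country": "france"},
    {"id": "org-2", "name": "university of lyon", "city": "lyon", "country": "france"},
]


def normalize(text: str) -> str:
    """Lowercase the text and collapse every non-alphanumeric run to one space."""
    return re.sub(r"[^a-z0-9]+", " ", text.lower()).strip()


def contains_phrase(haystack: str, phrase: str) -> bool:
    """Return True if `phrase` occurs in `haystack` on whole-word boundaries."""
    return f" {phrase} " in f" {haystack} "


def match(
    affiliation: str,
    strategies: Sequence[Sequence[str]],
    registry: Sequence[Dict[str, str]] = REGISTRY,
) -> Optional[Dict[str, object]]:
    """Try each strategy in order and return the first unambiguous match.

    A strategy is a tuple of field names; an entry matches when every listed
    field value appears in the normalized affiliation. Ambiguous strategies
    (zero or several matching entries) are skipped. This acceptance rule is
    an assumption of the sketch.
    """
    text = normalize(affiliation)
    for criteria in strategies:
        hits = [
            entry["id"]
            for entry in registry
            if all(contains_phrase(text, normalize(entry[field])) for field in criteria)
        ]
        if len(hits) == 1:
            return {"id": hits[0], "criteria": list(criteria)}
    return None


# Strict list: favors precision. Extended list: trades precision for recall.
PRECISE = [("name", "city")]
BROAD = [("name", "city"), ("name",)]
```

With these definitions, `match("Sorbonne University, France", PRECISE)` returns `None` because the city is missing, while the same call with `BROAD` falls back to the name-only strategy and returns `org-1`. Choosing which strategies to include is how the user trades precision for recall in this sketch.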
Taxonomy
Topics: Data Quality and Management · Semantic Web and Ontologies · Data Mining Algorithms and Applications
