Practical Author Name Disambiguation under Metadata Constraints: A Contrastive Learning Approach for Astronomy Literature
Vicente Amado Olivo, Wolfgang Kerzendorf, Bangjing Lu, Joshua V. Shields, Andreas Flörs, and Nutan Chen

TL;DR
This paper presents a neural network-based method for author name disambiguation in astronomy literature that performs well even with limited metadata, addressing a key challenge in digital libraries.
Contribution
It introduces the Neural Author Name Disambiguator, a contrastive learning approach using a Siamese neural network that disambiguates authors with minimal metadata, and provides a new benchmark dataset.
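To make the contrastive idea concrete, here is a minimal sketch of a classic margin-based contrastive loss over pairs of publication embeddings. This summary does not state the paper's actual loss, architecture, or features, so the function name `contrastive_loss`, the `margin` parameter, and the choice of Euclidean distance are all assumptions for illustration. In a Siamese setup, both embeddings in a pair would come from the same shared-weight encoder.

```python
import numpy as np

def contrastive_loss(emb_a, emb_b, same_author, margin=1.0):
    """Margin-based contrastive loss averaged over a batch of embedding pairs.

    Hypothetical illustration, not the paper's implementation.
    emb_a, emb_b: arrays of shape (batch, dim), one embedding per publication.
    same_author:  array of shape (batch,), 1 if the pair shares an author, else 0.
    """
    emb_a = np.asarray(emb_a, dtype=float)
    emb_b = np.asarray(emb_b, dtype=float)
    same_author = np.asarray(same_author, dtype=float)

    # Euclidean distance between the two embeddings of each pair.
    dist = np.linalg.norm(emb_a - emb_b, axis=1)

    # Same-author pairs are penalized for being far apart.
    pull = same_author * dist ** 2
    # Different-author pairs are penalized only if closer than the margin.
    push = (1.0 - same_author) * np.maximum(0.0, margin - dist) ** 2

    return float(np.mean(pull + push))
```

Under this loss, a trained encoder places publications by the same person close together, so a distance threshold can decide pairs and a clustering step can group publications by author.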
Findings
Achieves up to 94% accuracy in pairwise disambiguation.
Reaches an F1 score above 95% when clustering publications.
Disambiguates effectively even when metadata is limited.
Abstract
The ability to distinctly and properly collate an individual researcher's publications is crucial for ensuring appropriate recognition, guiding the allocation of research funding, and informing hiring decisions. However, accurately grouping and linking a researcher's entire body of work with their individual identity is challenging because of widespread name ambiguity across the growing literature. Algorithmic author name disambiguation provides a scalable approach to disambiguating author identities, yet existing methods have limitations. Many modern author name disambiguation methods rely on comprehensive metadata features such as venue or affiliation. Despite advancements in digitally indexing publications, metadata is often unavailable or inconsistent in large digital libraries (e.g., NASA/ADS). We introduce the Neural Author Name Disambiguator, a method that disambiguates author…
Taxonomy
Topics: Data Quality and Management · Library Science and Information Systems · Authorship Attribution and Profiling
