A Bayesian Learning, Greedy agglomerative clustering approach and evaluation techniques for Author Name Disambiguation Problem
Shashwat Sourav

TL;DR
This paper reviews author name disambiguation techniques, focusing on Bayesian learning and greedy agglomerative clustering, and evaluates their effectiveness on large real-world databases.
Contribution
It introduces a Bayesian and greedy agglomerative clustering approach for author disambiguation and discusses evaluation techniques and future research directions.
Findings
Effective disambiguation on large databases
Bayesian and greedy methods improve accuracy
Provides comprehensive review and future outlook
Abstract
Author names are often ambiguous: the same author may appear under different names, and multiple authors may share similar names. This makes it difficult to associate a scholarly work with the person who wrote it, which introduces inaccuracy in credit attribution, bibliometric analysis, search-by-author in a digital library, and expert discovery. A plethora of techniques for disambiguating author names has been proposed in the literature, and this review focuses on those research efforts. I first go through the conventional methods, then discuss evaluation techniques and the clustering model, which finally leads to the Bayesian learning and greedy agglomerative approach. I believe this concentrated review will be useful for the research community because it discusses techniques applied to a very large real database that is actively used…
Taxonomy
Topics: Data Quality and Management · Biomedical Text Mining and Ontologies · Topic Modeling
