A Bayesian Learning, Greedy agglomerative clustering approach and evaluation techniques for Author Name Disambiguation Problem

Shashwat Sourav

arXiv:2211.01303 · cs.DL · November 3, 2022

Open Access

TL;DR

This paper reviews author name disambiguation techniques, focusing on Bayesian learning and greedy agglomerative clustering, and discusses their application to a very large, actively used real-world database.

Contribution

It introduces a Bayesian and greedy agglomerative clustering approach for author disambiguation and discusses evaluation techniques and future research directions.

Findings

01. Effective disambiguation on large databases

02. Bayesian and greedy methods improve accuracy

03. Provides comprehensive review and future outlook

Abstract

Author names are often ambiguous: the same author can appear under different names, and multiple authors can share similar names. This ambiguity makes it difficult to associate a scholarly work with the person who wrote it, which introduces inaccuracy in credit attribution, bibliometric analysis, search-by-author in a digital library, and expert discovery. A plethora of techniques for disambiguating author names has been proposed in the literature. I focus on the research efforts aimed at disambiguating author names. I first go through the conventional methods, then discuss evaluation techniques and the clustering model, which finally leads to the Bayesian learning and greedy agglomerative approach. I believe this concentrated review will be useful for the research community because it discusses techniques applied to a very large real database that is actively used…

Taxonomy

Topics: Data Quality and Management · Biomedical Text Mining and Ontologies · Topic Modeling