Separating the articles of authors with the same name

Jose M. Soler

arXiv:cs/0608004 · cs.DL · May 23, 2007 · Scientometrics

Open Access

TL;DR

This paper presents a clustering method based on probabilistic distances to distinguish articles of authors sharing the same name, aiding citation analysis without requiring complete publication lists.

Contribution

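It introduces a distance-based clustering approach for separating the articles of different authors who share a name. The distance between two publications is defined by how improbable their coincidences would be if both were drawn at random from all published documents.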

Findings

01 Effective clustering of articles by author identity

02 Simplifies citation analysis processes

03 Useful for authors with common names

Abstract

I describe a method to separate the articles of different authors with the same name. It is based on a distance between any two publications, defined in terms of the probability that they would have as many coincidences if they were drawn at random from all published documents. Articles with a given author name are then clustered according to their distance, so that all articles in a cluster very likely belong to the same author. The method has proven very useful in generating groups of papers that are then selected manually. This considerably simplifies citation analysis when the author publication lists are not available.
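
The idea in the abstract can be illustrated with a minimal Python sketch. It is not the paper's actual formula or algorithm: the abstract does not specify either, so everything here is an assumption made for illustration. Each article is reduced to a set of hypothetical feature strings (coauthors, journals, keywords), `freq` is a made-up table giving the fraction of all documents that contain each feature, features are treated as independent, and articles are merged by simple single-linkage whenever the chance-coincidence probability falls below an arbitrary `threshold`. The names `chance_probability` and `cluster_articles` are likewise invented for this example.

```python
from itertools import combinations
from math import prod


def chance_probability(a, b, freq):
    """Probability that a randomly drawn document would contain every
    feature shared by articles a and b (assumes independent features).
    A small value means the coincidences are unlikely to be accidental."""
    shared = a & b
    # Empty product is 1: with nothing in common there is no evidence at all.
    return prod(freq[f] for f in shared)


def cluster_articles(articles, freq, threshold):
    """Single-linkage clustering: merge two articles whenever their
    chance-coincidence probability is below `threshold` (an assumed cutoff)."""
    parent = list(range(len(articles)))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(articles)), 2):
        if chance_probability(articles[i], articles[j], freq) < threshold:
            parent[find(i)] = find(j)

    groups = {}
    for i in range(len(articles)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


# Hypothetical data: four articles signed with the same author name.
articles = [
    {"coauthor:A", "journal:X"},
    {"coauthor:A", "keyword:k"},
    {"journal:Y", "keyword:z"},
    {"journal:Y", "coauthor:B"},
]
# Made-up fractions of all documents containing each feature.
freq = {
    "coauthor:A": 1e-4, "coauthor:B": 1e-4,
    "journal:X": 1e-2, "journal:Y": 1e-2,
    "keyword:k": 1e-3, "keyword:z": 1e-3,
}

clusters = cluster_articles(articles, freq, threshold=1e-3)
print(clusters)  # [[0, 1], [2], [3]]
```

In this toy run, sharing a rare coauthor is enough to merge articles 0 and 1, while sharing a widely used journal is not enough to merge articles 2 and 3. The resulting groups would still be checked by hand, as the abstract describes.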

Taxonomy

Topics: Semantic Web and Ontologies · Biomedical Text Mining and Ontologies · Data Mining Algorithms and Applications