Scholar Name Disambiguation with Search-enhanced LLM Across Language

Renyu Zhao; Yunxin Chen

arXiv:2411.17102 · cs.IR · March 5, 2025 · 3 cites

Open Access

TL;DR

This paper introduces a search-enhanced, multilingual LLM-based approach for scholar name disambiguation, significantly improving accuracy by leveraging search engine capabilities and cross-language features.

Contribution

It presents a novel method combining search engine techniques with large language models to enhance scholar name disambiguation across multiple languages.

Findings

01. Improved disambiguation accuracy with local language data
02. Effective use of search engine capabilities for data enrichment
03. Enhanced performance across diverse geographic regions

Abstract

The task of scholar name disambiguation is crucial in various real-world scenarios, including bibliometric-based candidate evaluation for awards, anti-fraud checks on application materials, and more. Despite significant advancements, current methods face limitations due to the complexity of heterogeneous data, often necessitating extensive human intervention. This paper proposes a novel approach that leverages search-enhanced language models across multiple languages to improve name disambiguation. By utilizing the powerful query rewriting, intent recognition, and data indexing capabilities of search engines, our method can gather richer information for distinguishing between entities and extracting profiles, resulting in a more comprehensive data dimension. Given the strong cross-language capabilities of large language models (LLMs), optimizing enhanced retrieval methods with this…

Taxonomy

Topics: Data Quality and Management · Biomedical Text Mining and Ontologies · Library Science and Information Systems