Traditional statistical representations outperform generative AI in identifying expert peer reviewers

Vicente Amado Olivo, Tereza Jerabkova, Jakub Klencki, John Carpenter, Mario Malički, Ferdinando Patat, Louis-Gregory Strolger, and Wolfgang Kerzendorf

arXiv:2605.18752·cs.IR·May 19, 2026

TL;DR

Traditional statistical methods such as TF-IDF outperform large language models such as GPT-4o mini at identifying domain experts for peer review, which highlights the importance of fine-grained vocabulary in expertise detection.

Contribution

This study provides a rigorous empirical evaluation comparing statistical and AI-driven methods for expert identification, demonstrating the superiority of traditional statistical representations.

Findings

01

Term Frequency-Inverse Document Frequency (TF-IDF) identified experts 79.5% of the time within the top 25 recommendations.

02

GPT-4o mini achieved 51.5% success in expert identification.

03

Statistical methods outperform generative AI in specialized scientific tasks.
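The percentages above read like a top-k success rate: the share of queries for which at least one known expert appears among the first k recommendations. This summary does not give the paper's exact metric definition, so the following is only a minimal sketch of that reading. The function name `hit_rate_at_k`, the reviewer IDs, and the toy data are all hypothetical.

```python
def hit_rate_at_k(rankings, true_experts, k=25):
    """Fraction of queries whose top-k candidates include at least one true expert.

    rankings:     dict mapping query id -> list of candidate ids, best first
    true_experts: dict mapping query id -> set of ids treated as ground truth
    """
    hits = 0
    for query_id, ranked in rankings.items():
        # A query counts as a hit if any ground-truth expert is in the top k.
        if set(ranked[:k]) & true_experts[query_id]:
            hits += 1
    return hits / len(rankings)


# Toy data (invented for illustration only).
rankings = {
    "proposal_a": ["r3", "r1", "r7"],
    "proposal_b": ["r2", "r5", "r4"],
}
true_experts = {"proposal_a": {"r1"}, "proposal_b": {"r9"}}

rate = hit_rate_at_k(rankings, true_experts, k=2)
print(rate)  # 0.5: proposal_a is a hit, proposal_b is a miss
```

With k=25, a value of 0.795 under this definition would correspond to the 79.5% figure, if the paper's metric matches this assumption.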

Abstract

The exponential growth of scientific submissions has strained the peer review system. Despite the rapidly expanding global pool of researchers, this unprecedented scale has rendered the previous approach of manual expert identification unfeasible. Therefore, institutions have naturally turned to Large Language Models (LLMs) to automate intricate processes like expert reviewer identification. However, the reliability of these new models in accurately identifying domain experts lacks rigorous evaluation. We conduct a comprehensive empirical evaluation of statistical and AI-driven expertise identification methodologies to benchmark their reliability and limitations. Framing expert identification as an information retrieval problem, we utilize the distributed peer review system of a major international astronomical observatory, where proposal authorship serves as our proxy ground truth for…
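To make the information-retrieval framing in the abstract concrete, here is a small TF-IDF ranking sketch in plain Python. It is not the authors' pipeline: the whitespace tokenizer, the smoothed IDF formula, the cosine-similarity scoring, and the reviewer profiles are all assumptions chosen for illustration.

```python
import math
from collections import Counter


def tokenize(text):
    # Deliberately naive tokenizer: lowercase and split on whitespace.
    return text.lower().split()


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (a dict) per document, plus the IDF table."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents containing each term.
    df = Counter(term for toks in tokenized for term in set(toks))
    # Smoothed IDF (one common variant, assumed here).
    idf = {t: math.log((1 + n_docs) / (1 + c)) + 1.0 for t, c in df.items()}
    vectors = []
    for toks in tokenized:
        tf = Counter(toks)
        vectors.append({t: tf[t] * idf[t] for t in tf})
    return vectors, idf


def cosine(a, b):
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_candidates(query, profiles):
    """Rank candidate reviewers by cosine similarity between query and profile text."""
    names = list(profiles)
    vectors, idf = tfidf_vectors([profiles[n] for n in names])
    q_tf = Counter(tokenize(query))
    # Terms unseen in any profile carry no weight in the query vector.
    q_vec = {t: c * idf[t] for t, c in q_tf.items() if t in idf}
    scores = {n: cosine(q_vec, v) for n, v in zip(names, vectors)}
    return sorted(names, key=lambda n: scores[n], reverse=True)


# Hypothetical reviewer profiles built from each person's own text.
profiles = {
    "reviewer_1": "protoplanetary disk dust continuum millimeter interferometry",
    "reviewer_2": "type ia supernova light curve cosmology",
    "reviewer_3": "galaxy cluster weak lensing cosmology",
}
ranking = rank_candidates(
    "millimeter observations of dust in a protoplanetary disk", profiles
)
print(ranking[0])  # reviewer_1
```

Because TF-IDF weights rare, domain-specific terms heavily, a query sharing specialized vocabulary with one profile ranks that profile first, which is consistent with the summary's point about fine-grained vocabulary.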
