Traditional statistical representations outperform generative AI in identifying expert peer reviewers
Vicente Amado Olivo, Tereza Jerabkova, Jakub Klencki, John Carpenter, Mario Malički, Ferdinando Patat, Louis-Gregory Strolger, and Wolfgang Kerzendorf

TL;DR
Traditional statistical methods outperform large language models like GPT-4 in accurately identifying domain experts for peer review, highlighting the importance of fine-grained vocabulary in expertise detection.
Contribution
This study provides a rigorous empirical evaluation comparing statistical and AI-driven methods for expert identification, demonstrating the superiority of traditional statistical representations.
Findings
Term Frequency-Inverse Document Frequency (TF-IDF) identified experts 79.5% of the time within the top 25 recommendations.
GPT-4o mini achieved 51.5% success in expert identification.
Statistical methods outperform generative AI in specialized scientific tasks.
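To make the TF-IDF finding concrete, here is a minimal sketch of how reviewers might be ranked against a proposal with TF-IDF weights and cosine similarity. It is an illustration under stated assumptions, not the paper's pipeline: the whitespace tokenizer, the smoothed IDF variant, the function names, and the toy reviewer texts are all hypothetical.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (dict: term -> weight) per tokenized document."""
    n = len(docs)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for doc in docs for term in set(doc))
    # Smoothed IDF; this variant is an assumption, other formulas are common.
    idf = {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vectors.append({t: (count / len(doc)) * idf[t] for t, count in tf.items()})
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_reviewers(proposal, reviewer_texts, k=25):
    """Return the k reviewer names whose text is most similar to the proposal."""
    names = list(reviewer_texts)
    # Naive lowercase/whitespace tokenization (hypothetical preprocessing).
    docs = [reviewer_texts[name].lower().split() for name in names]
    docs.append(proposal.lower().split())
    vectors = tfidf_vectors(docs)
    query = vectors[-1]
    scored = sorted(
        zip(names, vectors[:-1]),
        key=lambda pair: cosine(query, pair[1]),
        reverse=True,
    )
    return [name for name, _ in scored[:k]]


# Toy data, invented for illustration only.
reviewers = {
    "reviewer_a": "supernova spectra radiative transfer ejecta",
    "reviewer_b": "binary star evolution mass transfer stellar winds",
    "reviewer_c": "galaxy clusters weak lensing dark matter",
}
top = rank_reviewers("spectra of supernova ejecta", reviewers, k=2)
```

Because rare, field-specific terms receive higher IDF weights, a reviewer who shares specialized vocabulary with the proposal rises to the top of the ranking, which is consistent with the emphasis on fine-grained vocabulary in the summary above.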
Abstract
The exponential growth of scientific submissions has strained the peer review system. Despite the rapidly expanding global pool of researchers, this unprecedented scale has rendered the previous approach of manual expert identification unfeasible. Therefore, institutions have naturally turned to Large Language Models (LLMs) to automate intricate processes like expert reviewer identification. However, the reliability of these new models in accurately identifying domain experts lacks rigorous evaluation. We conduct a comprehensive empirical evaluation of statistical and AI-driven expertise identification methodologies to benchmark their reliability and limitations. Framing expert identification as an information retrieval problem, we utilize the distributed peer review system of a major international astronomical observatory, where proposal authorship serves as our proxy ground truth for…
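Treating expert identification as retrieval suggests a top-k success measure: for each proposal, check whether a known expert appears among the first k recommendations. The sketch below shows one plausible reading of such a metric. The function name, the "at least one expert in the top k" criterion, and the toy data are assumptions; the truncated abstract does not specify the exact definition used in the paper.

```python
def success_at_k(rankings, experts, k=25):
    """Fraction of proposals with at least one known expert in the top k.

    rankings: dict mapping proposal id -> ordered list of recommended reviewers
    experts:  dict mapping proposal id -> set of ground-truth experts
              (for example proposal authors, used as a proxy)
    """
    if not rankings:
        return 0.0
    hits = 0
    for proposal_id, ranked in rankings.items():
        # A hit means the top-k list and the expert set overlap.
        if set(ranked[:k]) & experts.get(proposal_id, set()):
            hits += 1
    return hits / len(rankings)


# Toy data, invented for illustration only.
rankings = {
    "p1": ["a", "b", "c"],
    "p2": ["c", "b", "a"],
}
experts = {"p1": {"a"}, "p2": {"a"}}

rate_top1 = success_at_k(rankings, experts, k=1)
rate_top3 = success_at_k(rankings, experts, k=3)
```

In this toy case only the first proposal has its expert ranked first, so widening the cutoff from 1 to 3 raises the success rate from one half to one.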
