Can Large Language Models Evaluate Grant Proposal Quality? Revisiting the Wennerås and Wold Peer Review Data

Ulf Sandström; Mike Thelwall

arXiv:2603.14565 · cs.DL · March 17, 2026

Open Access

TL;DR

This study evaluates how well large language models score grant proposals by comparing their assessments with expert peer review scores. The LLM scores correlate only weakly with the expert scores (mean Spearman 0.22) and moderately with each other (mean Spearman 0.34), but may still have practical uses.

Contribution

First assessment of LLMs for scoring grant proposals, comparing their scores with expert reviews using historical data, and analyzing their potential role in funding decisions.

Findings

01

LLMs show moderate correlation among themselves (mean Spearman 0.34).

02

LLMs weakly correlate with expert scores (mean Spearman 0.22).

03

Best LLM correlation with experts was 0.33, about half of reviewer agreement.
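
The two mean Spearman figures above are averages of rank correlations: one over every pair of LLMs, the other over each LLM against the expert scores. A minimal sketch of that calculation follows, using only the Python standard library. The function names and the toy scores are illustrative assumptions and are not taken from the paper, whose own analysis pipeline is not described here.

```python
from itertools import combinations
from statistics import mean


def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        tied_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = tied_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; raises ZeroDivisionError if either input is constant."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the tie-averaged ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def mean_pairwise_spearman(score_lists):
    """Mean Spearman correlation over every pair of scorers."""
    return mean(spearman(a, b) for a, b in combinations(score_lists, 2))


def mean_spearman_vs_reference(score_lists, reference):
    """Mean Spearman correlation of each scorer against one reference list."""
    return mean(spearman(s, reference) for s in score_lists)


if __name__ == "__main__":
    # Made-up scores for six proposals (hypothetical, not the paper's data).
    llm_scores = [
        [3, 4, 2, 5, 3, 1],
        [2, 4, 3, 4, 2, 2],
        [4, 3, 2, 5, 1, 3],
    ]
    expert_mean = [2.5, 3.0, 2.0, 3.5, 3.0, 1.5]
    print("LLM vs LLM:", round(mean_pairwise_spearman(llm_scores), 2))
    print("LLM vs experts:", round(mean_spearman_vs_reference(llm_scores, expert_mean), 2))
```

Tie-averaged ranks matter here because review scores usually sit on a short ordinal scale, so many proposals share the same score. With real data, a library routine such as `scipy.stats.spearmanr` would also supply p-values for the significance tests the abstract mentions.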

Abstract

Purpose: Despite the importance of peer review for grant funding decisions, academics are often reluctant to conduct it. This can lead to long delays between submission and the final decision, as well as the risk of substandard reviews from busy or non-specialist scholars. At least one funder now uses Large Language Models (LLMs) to reduce the reviewing burden, but the accuracy of LLMs for scoring grant proposals needs to be assessed. Design/methodology/approach: This article compares scores from a range of medium-sized open-weights LLMs with peer review scores for a well-researched dataset, the Swedish Medical Council's post-doctoral fellowship applications from 1994. Findings: Whilst the LLM scores correlated moderately with each other (mean Spearman correlation: 0.34), they correlated weakly but positively, and mostly statistically significantly, with the average expert scores (mean…

Taxonomy

Topics: Scientometrics and bibliometrics research · Academic Publishing and Open Access · Meta-analysis and systematic reviews