Research evaluation with ChatGPT: Is it age, country, length, or field biased?

Mike Thelwall; Zeyneb Kurt

arXiv:2411.09768·cs.DL·November 18, 2024·Scientometrics

Open Access

TL;DR

This study examines biases in ChatGPT's research quality assessments, revealing influences from publication year, field, and abstract length, and emphasizes the need for normalization for accurate evaluations.

Contribution

It provides empirical evidence of biases in ChatGPT's research evaluation scores across fields, years, and abstract lengths, and suggests normalization methods for better accuracy.
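
One simple way to picture the kind of normalization mentioned here is to divide each ChatGPT score by the mean score of articles from the same field and publication year, so that a value of 1.0 means "average for its field and year". The sketch below illustrates that idea only. The function name `normalize_scores`, the record layout, and the choice of mean-ratio normalization are assumptions made for illustration; this summary does not state which procedure the paper recommends.

```python
from collections import defaultdict
from statistics import mean


def normalize_scores(records):
    """Return each score divided by the mean score of its (field, year) group.

    `records` is a list of dicts with the hypothetical keys
    'field', 'year' and 'score'. Scores are assumed to be positive,
    so every group mean is non-zero.
    """
    # Collect the raw scores for every field-year combination.
    groups = defaultdict(list)
    for r in records:
        groups[(r["field"], r["year"])].append(r["score"])

    # Mean score per field-year group.
    group_means = {key: mean(values) for key, values in groups.items()}

    # A normalized score of 1.0 is average for that field and year.
    return [r["score"] / group_means[(r["field"], r["year"])] for r in records]


# Illustrative, made-up records (not data from the paper).
example = [
    {"field": "A", "year": 2003, "score": 2.0},
    {"field": "A", "year": 2003, "score": 4.0},
    {"field": "B", "year": 2023, "score": 3.0},
]
normalized = normalize_scores(example)
```

With scores expressed relative to their own field and year, articles from a high-scoring field or a recent year are no longer favoured merely because of where and when they were published.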

Findings

01

Average scores increased with publication year, and this rise was not explained by changes in author nationality or in title and abstract length.

02

Scores varied substantially between research fields and between first-author countries.

03

Longer abstracts tend to receive higher scores, likely reflecting article quality.

Abstract

Some research now suggests that ChatGPT can estimate the quality of journal articles from their titles and abstracts. This has created the possibility to use ChatGPT quality scores, perhaps alongside citation-based formulae, to support peer review for research evaluation. Nevertheless, ChatGPT's internal processes are effectively opaque, despite it writing a report to support its scores, and its biases are unknown. This article investigates whether publication date and field are biasing factors. Based on submitting a monodisciplinary journal-balanced set of 117,650 articles from 26 fields published in the years 2003, 2008, 2013, 2018 and 2023 to ChatGPT 4o-mini, the results show that average scores increased over time, and this was not due to author nationality or title and abstract length changes. The results also varied substantially between fields, and first author countries. In…

Taxonomy

Topics: Artificial Intelligence in Healthcare and Education · Machine Learning in Healthcare