GPT vs Human for Scientific Reviews: A Dual Source Review on Applications of ChatGPT in Science

Chenxi Wu; Alan John Varghese; Vivek Oommen; George Em Karniadakis

arXiv:2312.03769 · cs.CL · December 8, 2023 · 1 citation

Open Access

TL;DR

This study compares reviews written by SciSpace, a large language model, with those of a human reviewer. About half of SciSpace's responses to objective questions align with the human reviewer's, but LLMs still struggle to understand complex methodologies, evaluate innovative claims, and assess ethical considerations.

Contribution

It provides a comprehensive evaluation of GPT models' performance in scientific reviews, highlighting their strengths and current limitations compared to human reviewers.

Findings

01

50% of SciSpace responses align with human reviews on objective questions

02

GPT-4 often rates the human reviews higher than SciSpace's in accuracy

03

SciSpace scores higher in structure, clarity, and completeness

Abstract

The new polymath Large Language Models (LLMs) can greatly speed up scientific reviews, possibly using more unbiased quantitative metrics, facilitating cross-disciplinary connections, and identifying emerging trends and research gaps by analyzing large volumes of data. At present, however, they lack the required deep understanding of complex methodologies, have difficulty evaluating innovative claims, and are unable to assess ethical issues and conflicts of interest. Herein, we consider 13 GPT-related papers across different scientific domains, each reviewed by a human reviewer and by SciSpace, a large language model, with the reviews evaluated by three distinct types of evaluators, namely GPT-3.5, a crowd panel, and GPT-4. We found that 50% of SciSpace's responses to objective questions align with those of a human reviewer, with GPT-4 (informed evaluator) often rating the…

Taxonomy

Topics: Artificial Intelligence in Healthcare and Education · Explainable Artificial Intelligence (XAI) · Topic Modeling

Methods: Attention Is All You Need · Linear Layer · Softmax · Cosine Annealing · Multi-Head Attention · Adam · Absolute Position Encodings · Layer Normalization