ChatGPT as Research Scientist: Probing GPT's Capabilities as a Research Librarian, Research Ethicist, Data Generator and Data Predictor

Steven A. Lehr; Aylin Caliskan; Suneragiri Liyanage; Mahzarin R. Banaji

arXiv:2406.14765·cs.AI·June 24, 2024·5 cites

Open Access

TL;DR

This study systematically evaluates the capabilities of GPT-3.5 and GPT-4 across four key scientific roles, revealing strengths in research ethics and data generation but limitations in predicting novel empirical results, with rapid improvement observed from GPT-3.5 to GPT-4.

Contribution

The paper provides a comprehensive assessment of GPT's performance on scientific tasks, highlighting its current strengths and weaknesses across multiple research-related functions.

Findings

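01 GPT-4 is better than GPT-3.5 at acknowledging when its references are fictional

02 GPT-4 detects research violations such as p-hacking, correcting 88.6% of blatantly presented and 72.6% of subtly presented issues

03 Both models replicate known patterns of cultural bias when generating data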

Abstract

How good a research scientist is ChatGPT? We systematically probed the capabilities of GPT-3.5 and GPT-4 across four central components of the scientific process: as a Research Librarian, Research Ethicist, Data Generator, and Novel Data Predictor, using psychological science as a testing field. In Study 1 (Research Librarian), unlike human researchers, GPT-3.5 and GPT-4 hallucinated, authoritatively generating fictional references 36.0% and 5.4% of the time, respectively, although GPT-4 exhibited an evolving capacity to acknowledge its fictions. In Study 2 (Research Ethicist), GPT-4 (though not GPT-3.5) proved capable of detecting violations like p-hacking in fictional research protocols, correcting 88.6% of blatantly presented issues, and 72.6% of subtly presented issues. In Study 3 (Data Generator), both models consistently replicated patterns of cultural bias previously discovered…

Taxonomy

Topics: Artificial Intelligence in Healthcare and Education · COVID-19 diagnosis using AI · Machine Learning in Healthcare

Methods: Attention Is All You Need · Absolute Position Encodings · Label Smoothing · Cosine Annealing · Position-Wise Feed-Forward Layer · Linear Layer · Residual Connection · Discriminative Fine-Tuning