ChatGPT as Research Scientist: Probing GPT's Capabilities as a Research Librarian, Research Ethicist, Data Generator and Data Predictor
Steven A. Lehr, Aylin Caliskan, Suneragiri Liyanage, Mahzarin R. Banaji

TL;DR
This study systematically evaluates the capabilities of GPT-3.5 and GPT-4 across four scientific roles, finding strengths in research ethics and data generation but limitations in predicting novel empirical results, with GPT-4 improving markedly on GPT-3.5.
Contribution
It provides a comprehensive assessment of GPT's performance in scientific tasks, highlighting its current strengths and weaknesses across multiple research-related functions.
Findings
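GPT-4 hallucinates fewer references than GPT-3.5 (5.4% vs. 36.0%) and is better at acknowledging its fictions
GPT-4 corrects 88.6% of blatantly presented and 72.6% of subtly presented research violations such as p-hacking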
Models replicate known cultural biases in data generation
Abstract
How good a research scientist is ChatGPT? We systematically probed the capabilities of GPT-3.5 and GPT-4 across four central components of the scientific process: as a Research Librarian, Research Ethicist, Data Generator, and Novel Data Predictor, using psychological science as a testing field. In Study 1 (Research Librarian), unlike human researchers, GPT-3.5 and GPT-4 hallucinated, authoritatively generating fictional references 36.0% and 5.4% of the time, respectively, although GPT-4 exhibited an evolving capacity to acknowledge its fictions. In Study 2 (Research Ethicist), GPT-4 (though not GPT-3.5) proved capable of detecting violations like p-hacking in fictional research protocols, correcting 88.6% of blatantly presented issues, and 72.6% of subtly presented issues. In Study 3 (Data Generator), both models consistently replicated patterns of cultural bias previously discovered…
Taxonomy
Topics: Artificial Intelligence in Healthcare and Education · COVID-19 diagnosis using AI · Machine Learning in Healthcare
Methods: Attention Is All You Need · Absolute Position Encodings · Label Smoothing · Cosine Annealing · Position-Wise Feed-Forward Layer · Linear Layer · Residual Connection · Discriminative Fine-Tuning
