Large language models surpass human experts in predicting neuroscience results

Xiaoliang Luo; Akilles Rechardt; Guangzhi Sun; Kevin K. Nejad; Felipe Yáñez; Bati Yilmaz; Kangjoo Lee; Alexandra O. Cohen; Valentina Borghesani; Anton Pashkov; Daniele Marinazzo; Jonathan Nicholas; Alessandro Salatiello; Ilia Sucholutsky; Pasquale Minervini; Sepehr Razavi; Roberta Rocca; Elkhan Yusifov; Tereza Okalova; Nianlong Gu; Martin Ferianc; Mikail Khona; Kaustubh R. Patil; Pui-Shee Lee; Rui Mata; Nicholas E. Myers; Jennifer K Bizley; Sebastian Musslick; Isil Poyraz Bilgin; Guiomar Niso; Justin M. Ales; Michael Gaebler; N Apurva Ratan Murty; Leyla Loued-Khenissi; Anna Behler; Chloe M. Hall; Jessica Dafflon; Sherry Dongqi Bao; Bradley C. Love

arXiv:2403.03230 · q-bio.NC · December 2, 2024 · 6 cites

Open Access · 1 Repo

TL;DR

Large language models trained on scientific literature can outperform human experts in predicting neuroscience results, demonstrating their potential to assist in scientific discovery and knowledge synthesis.

Contribution

This paper introduces BrainBench and BrainGPT, showing that LLMs can surpass experts in neuroscience prediction tasks and are effective in synthesizing complex scientific knowledge.

Findings

01. LLMs outperform experts in predicting neuroscience experimental outcomes
02. Confidence levels in LLMs correlate with prediction accuracy
03. The approach is transferable to other scientific fields

Abstract

Scientific discoveries often hinge on synthesizing decades of research, a task that potentially outstrips human information processing capacities. Large language models (LLMs) offer a solution. LLMs trained on the vast scientific literature could potentially integrate noisy yet interrelated findings to forecast novel results better than human experts. To evaluate this possibility, we created BrainBench, a forward-looking benchmark for predicting neuroscience results. We find that LLMs surpass experts in predicting experimental outcomes. BrainGPT, an LLM we tuned on the neuroscience literature, performed better yet. Like human experts, when LLMs were confident in their predictions, they were more likely to be correct, which presages a future where humans and LLMs team together to make discoveries. Our approach is not neuroscience-specific and is transferable to other knowledge-intensive…

Code & Models

Repositories

braingpt-lovelab/brainbench · Official

Taxonomy

Topics: Topic Modeling · Explainable Artificial Intelligence (XAI) · Machine Learning in Bioinformatics