Scientific Hypothesis Generation by a Large Language Model: Laboratory Validation in Breast Cancer Treatment
Abbi Abdel-Rehim, Hector Zenil, Oghenejokpeme Orhobor, Marie Fisher, Ross J. Collins, Elizabeth Bourne, Gareth W. Fearnley, Emma Tate, Holly X. Smith, Larisa N. Soldatova, Ross D. King

TL;DR
This study demonstrates that large language models like GPT-4 can generate scientific hypotheses that hold up under experimental validation, here by identifying promising drug combinations for breast cancer treatment.
Contribution
The paper provides the first laboratory validation of LLM-generated hypotheses in biomedical research, showcasing their potential in hypothesis discovery and experimental testing.
Findings
GPT-4 identified 3 effective drug combinations out of 12 in initial tests.
GPT-4 generated 3 additional promising drug combinations after considering the initial results.
LLMs can serve as valuable sources of scientific hypotheses for experimental validation.
Abstract
Large language models (LLMs) have transformed AI and achieved breakthrough performance on a wide range of tasks. In science, the most interesting application of LLMs is for hypothesis formation. A feature of LLMs, which results from their probabilistic structure, is that the output text is not necessarily a valid inference from the training text. These are termed hallucinations, and are harmful in many applications. In science, some hallucinations may be useful: novel hypotheses whose validity may be tested by laboratory experiments. Here we experimentally test the application of LLMs as a source of scientific hypotheses using the domain of breast cancer treatment. We applied the LLM GPT4 to hypothesize novel synergistic pairs of FDA-approved non-cancer drugs that target the MCF7 breast cancer cell line relative to the non-tumorigenic breast cell line MCF10A. In the first round of laboratory experiments…
Taxonomy
Topics: Biomedical Text Mining and Ontologies
