When AI Does Science: Evaluating the Autonomous AI Scientist KOSMOS in Radiation Biology
Humza Nusrat, Omar Nusrat

TL;DR
This paper evaluates the autonomous AI scientist KOSMOS in radiation biology, finding that it generates some useful hypotheses alongside false or uncertain ones, and highlighting the importance of rigorous null model testing for AI-driven scientific discovery.
Contribution
The study provides a systematic evaluation of KOSMOS's hypothesis generation in radiation biology, emphasizing the need for rigorous validation of AI-generated scientific ideas.
Findings
CDO1 gene expression strongly predicts radiation response
KOSMOS identified a gene signature with moderate prognostic value
Some hypotheses generated by KOSMOS were false or uncertain
Abstract
Agentic AI "scientists" now use language models to search the literature, run analyses, and generate hypotheses. We evaluate KOSMOS, an autonomous AI scientist, on three problems in radiation biology using simple random-gene null benchmarks. Hypothesis 1: baseline DNA damage response (DDR) capacity across cell lines predicts the p53 transcriptional response after irradiation (GSE30240). Hypothesis 2: baseline expression of OGT and CDO1 predicts the strength of repressed and induced radiation-response modules in breast cancer cells (GSE59732). Hypothesis 3: a 12-gene expression signature predicts biochemical recurrence-free survival after prostate radiotherapy plus androgen deprivation therapy (GSE116918). The DDR-p53 hypothesis was not supported: DDR score and p53 response were weakly negatively correlated (Spearman rho = -0.40, p = 0.76), indistinguishable from random five-gene scores.…
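The random-gene null benchmark mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the function names (`gene_set_score`, `random_gene_null`), the mean-expression score, the two-sided empirical p-value with a +1 correction, and the toy data are all assumptions made for illustration. The idea is to compare the Spearman correlation of a candidate gene score against the correlations obtained from many random gene sets of the same size.

```python
import random

def ranks(values):
    """1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            r[order[k]] = avg
        i = j + 1
    return r

def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # correlation undefined for a constant vector; treat as 0
    return sxy / (sxx * syy) ** 0.5

def spearman(x, y):
    """Spearman rho is the Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

def gene_set_score(expr, genes):
    """Per-sample mean expression over a gene set (assumed scoring rule).
    expr maps gene name -> list of per-sample expression values."""
    n_samples = len(next(iter(expr.values())))
    return [sum(expr[g][s] for g in genes) / len(genes) for s in range(n_samples)]

def random_gene_null(expr, response, candidate_genes, n_draws=1000, seed=0):
    """Compare the candidate score's rho with rho from random same-size gene sets."""
    rng = random.Random(seed)
    observed = spearman(gene_set_score(expr, candidate_genes), response)
    pool = sorted(expr)  # draw from all genes, candidates included
    k = len(candidate_genes)
    null = [
        spearman(gene_set_score(expr, rng.sample(pool, k)), response)
        for _ in range(n_draws)
    ]
    # Two-sided empirical p-value with a +1 correction (an assumed convention)
    extreme = sum(abs(r) >= abs(observed) for r in null)
    return observed, (extreme + 1) / (n_draws + 1), null

# Toy usage on synthetic data (hypothetical gene names, not from any GEO dataset)
toy_rng = random.Random(42)
toy_expr = {f"GENE{i}": [toy_rng.gauss(0, 1) for _ in range(6)] for i in range(50)}
toy_response = [toy_rng.gauss(0, 1) for _ in range(6)]
toy_candidates = ["GENE0", "GENE1", "GENE2", "GENE3", "GENE4"]
rho, p_emp, null_rhos = random_gene_null(toy_expr, toy_response, toy_candidates, n_draws=200)
print(f"observed rho = {rho:.2f}, empirical p = {p_emp:.3f}")
```

With few samples, random gene sets can produce large correlations by chance, which is why a candidate score is only interesting if it clearly stands out from the null distribution.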
Taxonomy
Topics: Artificial Intelligence in Healthcare and Education · Explainable Artificial Intelligence (XAI) · Radiomics and Machine Learning in Medical Imaging
