SciHorizon-GENE: Benchmarking LLM for Life Sciences Inference from Gene Knowledge to Functional Understanding
Xiaohan Huang, Meng Xiao, Chuan Qin, Qingqing Long, Jinmiao Chen, Yuanchun Zhou, Hengshu Zhu

TL;DR
SciHorizon-GENE is a comprehensive benchmark for evaluating large language models' ability to reason from gene knowledge to functional understanding, revealing substantial variability across models and persistent challenges in biomedical reasoning tasks.
Contribution
The paper introduces SciHorizon-GENE, a large-scale, gene-centric benchmark for assessing LLMs' reasoning in biomedical contexts, focusing on gene-to-function inference and failure modes.
Findings
Significant heterogeneity in LLM reasoning capabilities.
Persistent challenges in faithfulness and completeness.
Insights into model selection for biological interpretation.
Abstract
Large language models (LLMs) have shown growing promise in biomedical research, particularly for knowledge-driven interpretation tasks. However, their ability to reliably reason from gene-level knowledge to functional understanding, a core requirement for knowledge-enhanced cell atlas interpretation, remains largely underexplored. To address this gap, we introduce SciHorizon-GENE, a large-scale gene-centric benchmark constructed from authoritative biological databases. The benchmark integrates curated knowledge for over 190K human genes and comprises more than 540K questions covering diverse gene-to-function reasoning scenarios relevant to cell type annotation, functional interpretation, and mechanism-oriented analysis. Motivated by behavioral patterns observed in preliminary examinations, SciHorizon-GENE evaluates LLMs along four biologically critical perspectives: research attention…
Taxonomy
Topics: Biomedical Text Mining and Ontologies · Single-cell and spatial transcriptomics · Bioinformatics and Genomic Networks
