OmniGenBench: A Modular Platform for Reproducible Genomic Foundation Models Benchmarking
Heng Yang, Jack Cole, Yuan Li, Renzhi Chen, Geyong Min, Ke Li

TL;DR
OmniGenBench is a modular, reproducible benchmarking platform that standardizes evaluation of genomic foundation models, addressing key challenges in data transparency, interoperability, and interpretability to advance genomic AI research.
Contribution
OmniGenBench introduces a unified, extensible platform for benchmarking genomic foundation models (GFMs), enabling consistent evaluation and fostering reproducibility in genomic AI research.
Findings
Supports over 31 open-source models
Provides automated, one-command evaluation
Addresses reproducibility and interpretability challenges
Abstract
The code of nature, embedded in DNA and RNA genomes since the origin of life, holds immense potential to impact both humans and ecosystems through genome modeling. Genomic Foundation Models (GFMs) have emerged as a transformative approach to decoding the genome. As GFMs scale up and reshape the landscape of AI-driven genomics, the field faces an urgent need for rigorous and reproducible evaluation. We present OmniGenBench, a modular benchmarking platform designed to unify the data, model, benchmarking, and interpretability layers across GFMs. OmniGenBench enables standardized, one-command evaluation of any GFM across five benchmark suites, with seamless integration of over 31 open-source models. Through automated pipelines and community-extensible features, the platform addresses critical reproducibility challenges, including data transparency, model interoperability, benchmark…
Taxonomy
Topics: Genomics and Phylogenetic Studies · Single-cell and spatial transcriptomics · Genomics and Rare Diseases
