Whole Genome Transformer for Gene Interaction Effects in Microbiome Habitat Specificity
Zhufeng Li, Sandeep S Cranganore, Nicholas Youngblut, Niki Kilbertus

TL;DR
This paper introduces a genome transformer model that predicts microbiome habitat specificity and uncovers gene interactions driving microbial adaptation, combining high predictive accuracy with interpretability for biological insights.
Contribution
It presents a novel transformer-based framework for genome-wide gene interaction analysis in microbiomes, enhancing understanding of habitat adaptation.
Findings
Solid predictive performance on microbiome genomes
Identification of known and new gene interaction networks
Sequence-level genome analysis reveals complex phenotype associations
Abstract
Leveraging the vast genetic diversity within microbiomes offers unparalleled insights into complex phenotypes, yet accurately predicting and understanding such traits from genomic data remains challenging. We propose a framework that takes advantage of existing large models for gene vectorization to predict habitat specificity from entire microbial genome sequences. Based on our model, we develop attribution techniques to elucidate gene interaction effects that drive microbial adaptation to diverse environments. We train and validate our approach on a large dataset of high-quality microbiome genomes from different habitats. We demonstrate not only solid predictive performance, but also how sequence-level information from entire genomes allows us to identify gene associations underlying complex phenotypes. Our attribution recovers known important interaction networks and proposes…
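The abstract does not say which attribution technique the authors use, so the following is only a generic illustration of what a "gene interaction effect" can mean for a trained predictor. It is a minimal sketch of a masking-based second-order difference: the interaction of genes i and j is the part of the model output that cannot be explained by removing each gene on its own. The names `toy_model`, `mask`, and `pairwise_interactions`, the one-dimensional gene vectors, and the built-in synergy between genes 0 and 2 are all hypothetical and stand in for a real habitat classifier operating on learned gene embeddings.

```python
from itertools import combinations
from typing import Callable, Dict, List, Optional, Sequence, Tuple

# A gene is represented by its embedding vector; None marks a masked (removed) gene.
Gene = Optional[Sequence[float]]


def toy_model(genes: Sequence[Gene]) -> float:
    """Hypothetical stand-in for a trained habitat classifier's score."""
    # Additive part: each present gene contributes its first embedding component.
    additive = sum(g[0] for g in genes if g is not None)
    # Assumed synergy: genes 0 and 2 add extra score only when both are present.
    synergy = 2.0 if genes[0] is not None and genes[2] is not None else 0.0
    return additive + synergy


def mask(genes: Sequence[Gene], drop: set) -> List[Gene]:
    """Return a copy of the genome with the genes at the given indices masked."""
    return [None if i in drop else g for i, g in enumerate(genes)]


def pairwise_interactions(
    model: Callable[[Sequence[Gene]], float], genes: Sequence[Gene]
) -> Dict[Tuple[int, int], float]:
    """Second-order difference f(all) - f(-i) - f(-j) + f(-i, -j) for every gene pair."""
    full = model(genes)
    scores: Dict[Tuple[int, int], float] = {}
    for i, j in combinations(range(len(genes)), 2):
        scores[(i, j)] = (
            full
            - model(mask(genes, {i}))
            - model(mask(genes, {j}))
            + model(mask(genes, {i, j}))
        )
    return scores


# Toy genome with three genes and one-dimensional embeddings (illustrative values).
genome = [[1.0], [0.5], [0.25]]
interactions = pairwise_interactions(toy_model, genome)
for pair, value in interactions.items():
    print(pair, value)
```

In this toy setting only the pair (0, 2) receives a non-zero score, because purely additive contributions cancel in the second-order difference. Exhaustive pairwise masking scales quadratically with the number of genes, so on real genomes it would need to be restricted to candidate pairs or replaced by a cheaper attribution method.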
Taxonomy
Topics: CRISPR and Genetic Engineering
