Modelling gene content across a phylogeny to determine when genes become associated
Jiahao Diao, Malgorzata M. O'Reilly, Barbara R. Holland

TL;DR
This paper introduces a stochastic model to infer when genes become functionally associated during evolution by analyzing gene gain and loss patterns across species, using statistical methods to detect shifts in gene interaction.
Contribution
It presents a novel stochastic modeling approach for detecting the timing and occurrence of gene associations in evolutionary history based on gene presence-absence data.
Findings
The model can distinguish between independent and dependent gene gain/loss rates.
Detection of rate shifts depends on tree size and magnitude of rate change.
AIC effectively supports models with multiple rate classes when appropriate.
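To make the AIC comparison concrete, here is a minimal sketch of the standard formula, AIC = 2k - 2 ln L, applied to two candidate fits. The log-likelihoods and parameter counts are hypothetical placeholders chosen for illustration; they are not results or parameterisations from the paper.

```python
def aic(log_likelihood: float, n_params: int) -> float:
    """Akaike information criterion: 2k - 2 ln L (lower is better)."""
    return 2 * n_params - 2 * log_likelihood


def best_by_aic(fits: dict) -> tuple:
    """fits maps model name -> (maximised log-likelihood, number of free parameters)."""
    scores = {name: aic(ll, k) for name, (ll, k) in fits.items()}
    return min(scores, key=scores.get), scores


# Hypothetical fits: one rate class on the whole tree versus two rate
# classes with a shift somewhere on the tree. All numbers are made up.
fits = {
    "one_rate_class": (-120.0, 4),
    "two_rate_classes": (-112.5, 9),
}
best, scores = best_by_aic(fits)
```

The extra parameters of the richer model are only worth paying for when the gain in log-likelihood outweighs the 2k penalty, which is why support for a rate shift depends on how much signal the tree carries.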
Abstract
In this work, we develop a stochastic model of gene gain and loss with the aim of inferring when (if at all) in evolutionary history an association between two genes arises. The data we consider is a species tree along with information on the presence or absence of two genes in each of the species. The biological motivation for our model is that if two genes are involved in the same biochemical pathway, i.e. they are both required for some function, then the rate of gain or loss of one gene in the pathway should depend upon the presence or absence of the other gene in the pathway. However, if the two genes are not functionally linked, then the rate of gain or loss of one gene should be independent of the state of the other gene. We simulate data under this model to determine under what conditions a shift from the independent rates class to the dependent rates class can be detected. For…
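The distinction between independent and dependent rates can be illustrated with a continuous-time Markov chain on the four presence/absence states of two genes. The sketch below is one common way to encode the idea and is an assumption, not the authors' parameterisation: the state ordering, the function names, and all rate values are hypothetical, and only one gene is allowed to change at a time.

```python
# States are (gene A present?, gene B present?).
STATES = [(0, 0), (0, 1), (1, 0), (1, 1)]


def build_q(rate):
    """Build a 4x4 rate matrix; rate(gene, event, other_present) -> float."""
    n = len(STATES)
    q = [[0.0] * n for _ in range(n)]
    for i, s in enumerate(STATES):
        for j, t in enumerate(STATES):
            changed = [g for g in (0, 1) if s[g] != t[g]]
            if len(changed) != 1:
                continue  # no change, or both genes changing at once
            g = changed[0]
            event = "gain" if t[g] == 1 else "loss"
            q[i][j] = rate(g, event, bool(s[1 - g]))
        q[i][i] = -sum(q[i])  # rows of a rate matrix sum to zero
    return q


def mat_mul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def transition_probs(q, t, terms=20, squarings=10):
    """P(t) = exp(Q t), via a Taylor series with scaling and squaring."""
    n = len(q)
    scale = t / (2 ** squarings)
    a = [[x * scale for x in row] for row in q]
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms + 1):
        term = [[x / k for x in row] for row in mat_mul(term, a)]
        result = [[r + s for r, s in zip(rr, tr)]
                  for rr, tr in zip(result, term)]
    for _ in range(squarings):
        result = mat_mul(result, result)
    return result


def prob_a_present(p, start_index):
    """Probability that gene A is present at time t, given the start state."""
    return sum(p[start_index][j] for j, s in enumerate(STATES) if s[0] == 1)


# Hypothetical rates. Independent class: the other gene is ignored.
def independent_rate(gene, event, other_present):
    return 0.3 if event == "gain" else 0.5


# Dependent class: gain is faster and loss slower when the partner is present.
def dependent_rate(gene, event, other_present):
    if event == "gain":
        return 0.6 if other_present else 0.1
    return 0.2 if other_present else 0.8


p_ind = transition_probs(build_q(independent_rate), 1.0)
p_dep = transition_probs(build_q(dependent_rate), 1.0)
```

Under the independent rates, `prob_a_present(p_ind, 0)` and `prob_a_present(p_ind, 1)` agree, because the fate of gene A does not depend on whether gene B starts present. Under the dependent rates the two values differ, and that difference is the kind of signal a likelihood-based comparison can pick up.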
Taxonomy
Topics: Bioinformatics and Genomic Networks · Gene expression and cancer classification · Genetic Associations and Epidemiology
