Bayesian Analysis of Partitioned Data
Brian R. Moore, Jim McGuire, Fredrik Ronquist, and John P. Huelsenbeck

TL;DR
This paper introduces a Bayesian method using Dirichlet process priors to model process heterogeneity in phylogenetic data, allowing for flexible, data-driven partitioning that improves inference robustness.
Contribution
The paper proposes a Bayesian approach that treats the assignment of data to partitions as a random variable. Rather than fixing a single partition scheme in advance, the analysis integrates over all possible partitions.
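To make the idea of a random partition concrete, the sketch below draws one partition from a Dirichlet process prior using the Chinese restaurant process, a standard constructive view of that prior. It is an illustration only, not the authors' implementation: the function name `sample_crp_partition`, the concentration parameter `alpha`, and the choice of what counts as an "item" (for example, predefined data subsets) are assumptions introduced here.

```python
import random


def sample_crp_partition(n_items, alpha, rng):
    """Draw one partition of n_items from a Chinese restaurant process.

    Hypothetical helper for illustration; alpha > 0 is the concentration
    parameter of the assumed Dirichlet process prior.
    Returns a list where entry i is the index of the subset holding item i.
    """
    assignments = []  # assignments[i] = subset index of item i
    sizes = []        # sizes[k] = number of items currently in subset k
    for _ in range(n_items):
        # An existing subset k is chosen with weight sizes[k];
        # a brand-new subset is opened with weight alpha.
        weights = sizes + [alpha]
        k = rng.choices(range(len(weights)), weights=weights)[0]
        if k == len(sizes):
            sizes.append(1)   # open a new subset
        else:
            sizes[k] += 1     # join an existing subset
        assignments.append(k)
    return assignments


# Usage: one prior draw for six items with alpha = 1.0.
rng = random.Random(0)
partition = sample_crp_partition(6, 1.0, rng)
print(partition)
```

Larger values of `alpha` tend to produce more subsets, smaller values fewer. In a full analysis the partition would be sampled jointly with the tree and substitution-model parameters (for example by MCMC), which this sketch does not attempt.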
Findings
The method discovers novel process partitions that improve model fit.
Phylogenetic inference becomes more robust to process heterogeneity.
Comparisons show advantages over traditional fixed-partition methods.
Abstract
Variation in the evolutionary process across the sites of nucleotide sequence alignments is well established, and is an increasingly pervasive feature of datasets composed of gene regions sampled from multiple loci and/or different genomes. Inference of phylogeny from these data demands that we adequately model the underlying process heterogeneity; failure to do so can lead to biased estimates of phylogeny and other parameters. Traditionally, process heterogeneity has been accommodated by first assigning sites to data subsets based on relevant prior information (reflecting codon positions in protein-coding DNA, stem and loop regions of ribosomal DNA, etc.), and then estimating the phylogeny and other model parameters under the resulting mixed model. Here, we consider an alternative approach for accommodating process heterogeneity that is similar in spirit to this conventional…
Taxonomy
Topics: Bayesian Methods and Mixture Models · Genome Rearrangement Algorithms · Gene expression and cancer classification
