Improved Variational Bayesian Phylogenetic Inference using Mixtures
Oskar Kviman, Ricky Molén, Jens Lagergren

TL;DR
VBPI-Mixtures significantly improves phylogenetic posterior inference by effectively modeling complex tree-topology distributions using mixture learning within a variational Bayesian framework, outperforming previous methods.
Contribution
Introduces VBPI-Mixtures, a novel algorithm that enhances tree-topology posterior approximation in phylogenetics through mixture learning, addressing limitations of existing variational methods.
Findings
Achieves state-of-the-art density estimation on real datasets.
Successfully models multimodal tree-topology distributions.
Outperforms existing methods in accuracy and robustness.
Abstract
We present VBPI-Mixtures, an algorithm designed to enhance the accuracy of phylogenetic posterior distributions, particularly for tree-topology and branch-length approximations. Although Variational Bayesian Phylogenetic Inference (VBPI), a leading black-box variational inference (BBVI) framework, achieves remarkable approximations of these distributions, the multimodality of the tree-topology posterior presents a formidable challenge to sampling-based learning techniques such as BBVI. Advanced deep learning methodologies such as normalizing flows and graph neural networks have been explored to refine the branch-length posterior approximation, yet efforts to improve the posterior approximation over tree topologies have been lacking. Our novel VBPI-Mixtures algorithm bridges this gap by harnessing recent advances in mixture learning within the BBVI domain. As a…
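The mixture idea in the abstract can be illustrated on a toy problem. The sketch below is not the authors' implementation: it replaces tree topologies with four hypothetical discrete states, uses plain probability vectors instead of subsplit Bayesian networks, and computes a mixture evidence lower bound exactly by enumeration rather than with sampled gradient estimators. The function name `mixture_elbo` and all numbers are assumptions chosen for illustration. It shows that a uniform mixture of components, each focused on a different mode, can give a tighter bound on the log normalizing constant than a single component.

```python
import math


def mixture_elbo(log_p_tilde, components):
    """Exact mixture ELBO on a small discrete space (illustrative only).

    log_p_tilde: unnormalized log target value for each state.
    components:  list of probability vectors, one per mixture component,
                 combined with uniform weights 1/S.
    """
    num_components = len(components)
    num_states = len(log_p_tilde)
    # Probability of each state under the uniformly weighted mixture.
    q_mix = [
        sum(q[x] for q in components) / num_components
        for x in range(num_states)
    ]
    total = 0.0
    for q in components:
        for x in range(num_states):
            if q[x] > 0.0:  # zero-mass states contribute nothing
                total += q[x] * (log_p_tilde[x] - math.log(q_mix[x]))
    # Average the per-component expectations.
    return total / num_components


# Hypothetical bimodal target over four states (stand-ins for topologies).
p_tilde = [3.0, 1.0, 1.0, 3.0]
log_p_tilde = [math.log(p) for p in p_tilde]
log_z = math.log(sum(p_tilde))

# One component that concentrates on the first mode only.
single = mixture_elbo(log_p_tilde, [[0.7, 0.1, 0.1, 0.1]])

# Two components, each concentrating on a different mode.
mixed = mixture_elbo(
    log_p_tilde,
    [[0.7, 0.1, 0.1, 0.1], [0.1, 0.1, 0.1, 0.7]],
)

# Two components whose average matches the normalized target exactly.
exact = mixture_elbo(
    log_p_tilde,
    [[0.75, 0.25, 0.0, 0.0], [0.0, 0.0, 0.25, 0.75]],
)

print(f"log Z = {log_z:.4f}")
print(f"single-component bound = {single:.4f}")
print(f"two-component bound = {mixed:.4f}")
print(f"matched-mixture bound = {exact:.4f}")
```

In this toy setting the gap between each bound and log Z is the KL divergence from the mixture to the normalized target, so the bound is tight only when the mixture as a whole matches the target, even though no single component does.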
Peer Reviews
Decision: Submitted to ICLR 2024
The authors have motivated the problem quite well in terms of why they chose to combine mixture-based VI methods with subsplit Bayesian networks for modeling phylogenetic data. The mathematical derivations seem sound and the authors seem to be well aware of recent advances in VI for estimating stable gradients which tends to be very important. I fully understood the paper and all the key points made without any background in phylogenetics. Although I did have to read some primers on phylogenet
The work seems to lack technical depth. The main gradient update equations for VI, which are the crux of this paper, follow quite naturally from prior work on mixtures in VI. Equation 6, for example, follows directly from prior work. All of the observations made in this paper about the advantage of using a mixture in VI are from prior work. The derivation in equation 9 does seem somewhat new and this is perhaps the only thing that I couldn't directly pin on a prior paper. However, I didn't see
- The presentation of the paper is excellent. The appropriate amount of details are given to help the reader understand the method clearly.
- The experimental sections are comprehensive, presenting good results with multiple reasonable baselines to compare against.
- The method that the paper presents is a fairly straightforward application of preexisting methods (MISELBO, VIMCO) to a specific class of problems, which suggests a minor lack in the novelty of the work.
- The authors put a lot of emphasis on the claim that "the components [of the mixture] jointly explore the tree-topology space." I find that to be a weak statement. If the resulting ELBO is better, one would naturally expect the mixture components to be different, because otherwise it would no
The experimental validation is thorough, and the careful derivation of the VIMCO objectives is sound. The relationship to existing work is also made clear.
The clarity of the paper could be significantly improved. For example, figures refer to DS4, DS7, and DS8 whereas a specific and simpler example might help a reader better understand the method. Similarly, Figure 3 is difficult to read -- perhaps separating the target distribution into a separate plot from the learned approximate posteriors could help clarify this. Further, the motivation and examples (perhaps even Figure 1) could be expanded to use cases that could include e.g. syntax trees; pr
Taxonomy
Topics: Genomics and Phylogenetic Studies · Evolution and Paleontology Studies · Biomedical Text Mining and Ontologies
Methods: Variational Inference · Normalizing Flows
