A consistent least-squares criterion for calibrating edge lengths in phylogenetic networks
Jingcheng Xu, Cécile Ané

TL;DR
This paper introduces a consistent and scalable least-squares method for estimating edge lengths in phylogenetic networks from genetic distances, allowing for rate variation across genes and lineages when the network topology is known.
Contribution
It proposes a novel least-squares criterion that decomposes genetic distances into invariant and non-invariant parts, enabling consistent edge length estimation in phylogenetic networks.
Findings
The criterion is consistent provided that a tree path exists between some pair of tips in the network.
Edge lengths are identifiable from average genetic distances.
A constrained variant estimates relative times assuming a molecular clock.
Abstract
In phylogenetic networks, it is desirable to estimate edge lengths in substitutions per site or calendar time. Yet, there is a lack of scalable methods that provide such estimates. Here we consider the problem of obtaining edge length estimates from genetic distances, in the presence of rate variation across genes and lineages, when the network topology is known. We propose a novel criterion based on least-squares that is both consistent and computationally tractable. The crux of our approach is to decompose the genetic distances into two parts, one of which is invariant across displayed trees of the network. The scaled genetic distances are then fitted to the invariant part, while the average scaled genetic distances are fitted to the non-invariant part. We show that this criterion is consistent provided that there exists a tree path between some pair of tips in the network, and that…
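As background for the least-squares idea, here is a minimal sketch of the classical tree case: ordinary least-squares fitting of edge lengths to pairwise distances on a fixed topology. It is not the paper's network criterion, which additionally splits distances into invariant and non-invariant parts and handles rate variation. The 4-tip tree ((A,B),(C,D)), the edge names, and the functions `path_row`, `solve`, and `fit_edge_lengths` are hypothetical and chosen only for illustration.

```python
from itertools import combinations

# Hypothetical 4-tip tree ((A,B),(C,D)): four pendant edges plus one internal edge.
EDGES = ["A", "B", "C", "D", "internal"]
TIPS = ["A", "B", "C", "D"]
CHERRIES = [{"A", "B"}, {"C", "D"}]


def path_row(x, y):
    """Indicator row: 1.0 for each edge on the tree path between tips x and y."""
    crosses = {x, y} not in CHERRIES  # path uses the internal edge unless x, y form a cherry
    return [1.0 if e in (x, y) or (e == "internal" and crosses) else 0.0 for e in EDGES]


def solve(a, b):
    """Solve a square linear system by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [v - f * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_edge_lengths(distances):
    """Ordinary least squares via the normal equations X^T X b = X^T d."""
    pairs = list(combinations(TIPS, 2))
    X = [path_row(x, y) for x, y in pairs]
    d = [distances[p] for p in pairs]
    k = len(EDGES)
    xtx = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
    xtd = [sum(r[i] * v for r, v in zip(X, d)) for i in range(k)]
    return dict(zip(EDGES, solve(xtx, xtd)))


# Usage: made-up pairwise distances, keyed by tip pairs in the order produced by combinations().
example = {("A", "B"): 0.31, ("A", "C"): 0.44, ("A", "D"): 0.56,
           ("B", "C"): 0.55, ("B", "D"): 0.64, ("C", "D"): 0.70}
print(fit_edge_lengths(example))
```

With six pairwise distances and five edges the system is overdetermined, so noisy distances are reconciled by minimizing the squared residuals. This plain version does not constrain estimates to be non-negative.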
Taxonomy
Topics: Evolution and Paleontology Studies · Genomics and Phylogenetic Studies · Genetic diversity and population structure
