# Reconstructing rearrangement phylogenies of natural genomes

**Authors:** Leonard Bohnenkämper, Jens Stoye, Daniel Doerr

PMC · DOI: 10.1186/s13015-025-00279-5 · Algorithms for Molecular Biology : AMB · 2025-06-07

## TL;DR

This paper presents an optimized integer linear program (ILP) for reconstructing ancestral genomes along a given phylogeny under the DCJ-indel model, where genomes may contain duplicated markers and any mix of linear and circular chromosomes.

## Contribution

A highly optimized ILP for solving the Small Parsimony Problem for natural genomes under the DCJ-indel model, with an optimized way of handling both circular and linear chromosomes.

## Key findings

- On simulated phylogenies, the optimized ILP shows a considerable performance improvement over its predecessor on problems that include linear chromosomes.
- The method also outperforms its predecessor when the ground truth contains only one circular chromosome per genome, owing to its optimized handling of the solution space.
- The practical advantage is also visible in an analysis of seven Anopheles taxa.

## Abstract

We study the classical problem of inferring ancestral genomes from a set of extant genomes under a given phylogeny, known as the Small Parsimony Problem (SPP). Genomes are represented as sequences of oriented markers, organized in one or more linear or circular chromosomes. Any marker may appear in several copies, without restriction on orientation or genomic location; this setting is known as the natural genomes model. Evolutionary events along the branches of the phylogeny encompass large-scale rearrangements, including segmental inversions, translocations, gain and loss (DCJ-indel model). Even under simpler rearrangement models, such as the classical breakpoint model without duplicates, the SPP is computationally intractable. Nevertheless, the SPP for natural genomes under the DCJ-indel model has been studied recently, with limited success.
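To make the genome representation and the SPP objective concrete, here is a minimal Python sketch. It is not the authors' ILP. All names in it (`Chromosome`, `extremities`, `adjacencies`, `parsimony_score`) are hypothetical and chosen for illustration, and the distance function is left as a parameter because computing the DCJ-indel distance for natural genomes is itself a hard problem that this sketch does not attempt.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List, Optional, Tuple

Marker = Tuple[str, int]      # (family name, orientation: +1 or -1)
Extremity = Tuple[str, str]   # (family name, "t" for tail or "h" for head)
# An adjacency joins two neighbouring extremities; None stands for a telomere.
Adjacency = Tuple[Optional[Extremity], Optional[Extremity]]


@dataclass
class Chromosome:
    markers: List[Marker]
    circular: bool = False


Genome = List[Chromosome]


def extremities(marker: Marker) -> Tuple[Extremity, Extremity]:
    """Return the (left, right) ends of a marker as read along the chromosome."""
    name, sign = marker
    if sign > 0:
        return (name, "t"), (name, "h")
    return (name, "h"), (name, "t")


def adjacencies(chrom: Chromosome) -> List[Adjacency]:
    """List the adjacencies of a chromosome, including telomeres if linear."""
    ends = [extremities(m) for m in chrom.markers]
    if not ends:
        return []
    # Right end of each marker meets the left end of its successor.
    adjs: List[Adjacency] = [(a[1], b[0]) for a, b in zip(ends, ends[1:])]
    if chrom.circular:
        adjs.append((ends[-1][1], ends[0][0]))  # close the circle
    else:
        adjs.append((None, ends[0][0]))         # left telomere
        adjs.append((ends[-1][1], None))        # right telomere
    return adjs


def parsimony_score(
    tree_edges: Iterable[Tuple[str, str]],
    genomes: Dict[str, Genome],
    distance: Callable[[Genome, Genome], int],
) -> int:
    """Sum a pairwise distance over all branches of the phylogeny."""
    return sum(distance(genomes[u], genomes[v]) for u, v in tree_edges)
```

Because a marker family may occur several times, the extremity labels above are not unique within a genome. Any real distance computation would first have to decide which copies correspond to each other, which is one reason the natural genomes setting is harder than the duplicate-free one. In the SPP, the genomes at the leaves are fixed and the genomes at internal nodes are the unknowns chosen to minimize `parsimony_score`.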

Building on prior work, we present a highly optimized ILP that is able to solve the SPP for sufficiently small phylogenies and gene families. A notable improvement over the previous result is an optimized way of handling both circular and linear chromosomes. This is especially relevant to the SPP, since the chromosomal structure of ancestral genomes is unknown and the solution space for this chromosomal structure is typically large.

We benchmark our method on simulated and real data. On simulated phylogenies, we observe a considerable performance improvement on problems that include linear chromosomes. Even when the ground truth contains only one circular chromosome per genome, our method outperforms its predecessor due to its optimized handling of the solution space. The practical advantage also becomes visible in an analysis of seven Anopheles taxa.

## Linked entities

- **Species:** Anopheles (taxon 7164)

## Full-text entities

- **Species:** Anopheles (series) [taxon 44484]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12144824/full.md

## Figures

12 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12144824/full.md

---
Source: https://tomesphere.com/paper/PMC12144824