# Exemplar or Matching: Modeling DCJ Problems with Unequal Content Genome Data

**Authors:** Zhaoming Yin, Jijun Tang, Stephen W. Schaeffer, David A. Bader

arXiv: 1705.06559 · 2017-05-29

## TL;DR

This paper compares two methods for modeling genome rearrangement problems involving duplications and indels, developing algorithms to compute exact distances and median genomes, and evaluating their performance on synthetic and real data.

## Contribution

It introduces optimized branch-and-bound algorithms for exact distance computation and median genome problems under two duplication models, with comprehensive experimental evaluation.

## Key findings

- DCJ-Indel-Exemplar distance performs better on certain datasets.
- DCJ-Indel-Matching distance offers advantages in other scenarios.
- The median computation methods show different strengths depending on data characteristics.

## Abstract

The edit distance under the DCJ model can be computed in linear time for genomes with equal content or with Indels, but it becomes NP-Hard in the presence of duplications, a problem largely unsolved, especially when Indels are considered. In this paper, we compare two mainstream methods to deal with duplications and associate them with Indels: one by deletion, namely the DCJ-Indel-Exemplar distance, and the other by gene matching, namely the DCJ-Indel-Matching distance. We design branch-and-bound algorithms with a set of optimization methods to compute exact distances for both. Furthermore, median problems, which seek a median genome that minimizes the distances between itself and three given genomes, are discussed for both distance methods. The Lin-Kernighan (LK) heuristic is leveraged and enhanced by sub-graph decomposition and search space reduction techniques to handle median computation. A wide range of experiments is conducted on synthetic and real data sets to show the pros and cons of these two distance metrics per se, as well as in the median computation scenario.
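
To make the linear-time baseline mentioned in the abstract concrete, the following is a minimal sketch of the DCJ distance for two genomes with equal gene content and no duplications, using the well-known adjacency-graph formula d = N - (C + I/2), where N is the number of genes, C the number of cycles, and I the number of odd-length paths. It is not the paper's branch-and-bound method and does not handle Indels or duplications. The genome encoding (a list of `(signed_gene_list, is_circular)` pairs) and the function names `adjacency_partners` and `dcj_distance` are assumptions made for illustration.

```python
from typing import Dict, List, Optional, Tuple

Extremity = Tuple[int, str]          # (gene id, "t" for tail or "h" for head)
Chromosome = Tuple[List[int], bool]  # (signed gene order, is_circular); hypothetical encoding


def adjacency_partners(genome: List[Chromosome]) -> Dict[Extremity, Optional[Extremity]]:
    """Map each gene extremity to its neighbouring extremity, or None at a telomere."""
    partner: Dict[Extremity, Optional[Extremity]] = {}
    for genes, circular in genome:
        if not genes:
            continue
        # A forward gene is read tail -> head; a reversed gene is read head -> tail.
        ends = [
            ((abs(g), "t"), (abs(g), "h")) if g > 0 else ((abs(g), "h"), (abs(g), "t"))
            for g in genes
        ]
        for i in range(len(ends) - 1):
            right, left = ends[i][1], ends[i + 1][0]
            partner[right], partner[left] = left, right
        first_left, last_right = ends[0][0], ends[-1][1]
        if circular:
            partner[first_left], partner[last_right] = last_right, first_left
        else:
            partner[first_left] = partner[last_right] = None
    return partner


def dcj_distance(a: List[Chromosome], b: List[Chromosome]) -> int:
    """DCJ distance for equal-content, duplication-free genomes: N - (C + I/2)."""
    pa, pb = adjacency_partners(a), adjacency_partners(b)
    n_a = sum(len(genes) for genes, _ in a)
    n_b = sum(len(genes) for genes, _ in b)
    if pa.keys() != pb.keys() or len(pa) != 2 * n_a or len(pb) != 2 * n_b:
        raise ValueError("genomes must have equal gene content and no duplicated genes")

    seen = set()
    cycles = odd_paths = 0
    for start in pa:
        if start in seen:
            continue
        # Collect one connected component by following adjacencies of both genomes.
        component, stack = [], [start]
        seen.add(start)
        while stack:
            x = stack.pop()
            component.append(x)
            for y in (pa[x], pb[x]):
                if y is not None and y not in seen:
                    seen.add(y)
                    stack.append(y)
        # A component with no telomere is a cycle; otherwise it is a path whose
        # edge count in the adjacency graph equals its number of extremities.
        if all(pa[x] is not None and pb[x] is not None for x in component):
            cycles += 1
        elif len(component) % 2 == 1:
            odd_paths += 1
    return len(pa) // 2 - cycles - odd_paths // 2


if __name__ == "__main__":
    linear = [([1, 2, 3], False)]
    inverted = [([1, -2, 3], False)]
    print(dcj_distance(linear, inverted))
```

Each extremity is visited once, so the sketch runs in time linear in the number of genes. With duplicated genes, the extremity-to-neighbour map is no longer well defined, which is where the exemplar and matching models discussed in the abstract come in.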

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1705.06559/full.md

## Figures

18 figures with captions in the complete paper: https://tomesphere.com/paper/1705.06559/full.md

## References

32 references — full list in the complete paper: https://tomesphere.com/paper/1705.06559/full.md

---
Source: https://tomesphere.com/paper/1705.06559