# Pangenome-guided sequence assembly via binary optimization

**Authors:** Josh Cudby, James Bonfield, Chenxi Zhou, Richard Durbin, Sergii Strelchuk

PMC · DOI: 10.1093/bib/bbag084 · Briefings in Bioinformatics · 2026-02-26

## TL;DR

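This paper presents a pangenome-guided method for assembling short-read data. It annotates a pangenome graph with estimated copy numbers, then solves an optimization problem to find the path that best explains them, which avoids bias towards a single reference genome in complex, repetitive regions.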

## Contribution

The main contribution is to cast pangenome-guided assembly as a graph traversal optimization problem that can be solved either classically or on a quantum computer.
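
As an illustrative sketch only (this is a generic binary formulation of a copy-number-matching walk, not necessarily the one used in the paper), such a traversal can be written with hypothetical binary variables $x_{t,v}$, where $x_{t,v} = 1$ means the walk is at node $v$ at step $t$. Here $c_v$ is the estimated copy number of node $v$, $E$ is the edge set of the graph, $T$ is an assumed fixed walk length, and $\lambda_1, \lambda_2 > 0$ are assumed penalty weights:

$$
\min_{x \in \{0,1\}^{T \times |V|}} \;
\sum_{v \in V} \Big( \sum_{t=1}^{T} x_{t,v} - c_v \Big)^2
+ \lambda_1 \sum_{t=1}^{T} \Big( \sum_{v \in V} x_{t,v} - 1 \Big)^2
+ \lambda_2 \sum_{t=1}^{T-1} \sum_{(u,v) \notin E} x_{t,u}\, x_{t+1,v}
$$

The first term rewards walks whose visit counts match the copy numbers, the second enforces exactly one node per step, and the third penalizes consecutive nodes that are not joined by an edge. Every term is at most quadratic in the binary variables, which is the form that both classical solvers and quantum optimization heuristics typically accept.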

## Key findings

- On simulated data, the approach significantly reduces the number of contigs compared with de novo assemblers, at the cost of a small increase in inaccuracies such as false joins.
- The optimization-based methods are competitive with current exhaustive search techniques and more resilient to the copy number estimation noise inherent in short-read-based assembly.
- The methods are designed to scale more efficiently as the problem size grows and to run on future quantum computers; a small experiment on a real quantum device illustrates this.

## Abstract

De novo genome assembly is challenging in highly repetitive regions; however, reference-guided assemblers often suffer from bias. We propose a framework for pangenome-guided sequence assembly that can resolve short-read data in complex regions without bias towards a single reference genome. Our primary contribution is to frame the assembly as a graph traversal optimization problem, which can be implemented classically or on a quantum computer. The workflow involves first annotating pangenome graphs with estimated copy numbers for each node, then finding a path on the graph that best explains those copy numbers. On simulated data, our approach significantly reduces the number of contigs compared with de novo assemblers. While they introduce a small increase in inaccuracies, such as false joins, our optimization-based methods are competitive with current exhaustive search techniques. They are also designed to scale more efficiently as the problem size grows and will run effectively on future quantum computers; a small experiment on a real quantum device showcases this behaviour. Moreover, they are more resilient to noise in copy number estimation inherent in short-read-based assembly. We also develop novel tools for creating realistic synthetic pangenomes, aligning reads to pangenomes and for evaluating assembly quality.
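
The second step of the workflow, finding a path that best explains the copy numbers, can be illustrated with a minimal Python sketch. It is not the paper's implementation: the toy graph, the squared-error cost, and the names `path_cost` and `best_walk` are all assumptions made for illustration, and the search is a plain exhaustive enumeration rather than the optimization-based solver the paper proposes.

```python
from collections import Counter

# Hypothetical toy pangenome graph: a bubble (A vs B) followed by a
# tandem repeat R (self-loop). Node names and values are illustrative.
EDGES = {
    "S": ["A", "B"],
    "A": ["R"],
    "B": ["R"],
    "R": ["R", "E"],
}

# Assumed noisy copy-number estimates for each node.
COPY_NUMBERS = {"S": 1.0, "A": 0.1, "B": 0.9, "R": 2.8, "E": 1.1}


def path_cost(path, copy_numbers):
    """Sum of squared differences between visit counts and copy numbers."""
    visits = Counter(path)
    return sum((visits[node] - cn) ** 2 for node, cn in copy_numbers.items())


def best_walk(edges, copy_numbers, start, end, max_len):
    """Enumerate every walk from start with at most max_len nodes and
    return (cost, walk) for the cheapest one that finishes at end."""
    best_cost, best_path = float("inf"), None
    stack = [[start]]
    while stack:
        path = stack.pop()
        if path[-1] == end:
            cost = path_cost(path, copy_numbers)
            if cost < best_cost:
                best_cost, best_path = cost, path
        if len(path) < max_len:
            for nxt in edges.get(path[-1], []):
                stack.append(path + [nxt])
    return best_cost, best_path


if __name__ == "__main__":
    cost, walk = best_walk(EDGES, COPY_NUMBERS, "S", "E", max_len=8)
    print(walk, round(cost, 4))
```

On this toy input the selected walk takes the `B` branch of the bubble and traverses the repeat `R` three times, since those visit counts are closest to the estimated copy numbers. Exhaustive enumeration grows quickly with graph size and walk length, which is the scaling concern that motivates an optimization-based formulation.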

## Full-text entities

- **Diseases:** bleeding (MESH:D006470)
- **Methods and tools:** QAOA (quantum approximate optimization algorithm), Gurobi (mathematical optimization solver)
- **Species:** Homo sapiens (human, species) [taxon 9606]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12936794/full.md

## Figures

6 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12936794/full.md

## References

30 references — full list in the complete paper: https://tomesphere.com/paper/PMC12936794/full.md

---
Source: https://tomesphere.com/paper/PMC12936794