Pangenome-guided sequence assembly via binary optimisation
Josh Cudby, James Bonfield, Chenxi Zhou, Richard Durbin, Sergii Strelchuk

TL;DR
This paper introduces a pangenome-guided sequence assembly framework that formulates the problem as a graph traversal optimisation, leveraging classical and quantum computing to improve assembly in complex, repetitive regions.
Contribution
It presents a graph traversal optimisation formulation of pangenome-guided assembly that can be implemented classically or on a quantum computer and is reported to be resilient to noise, together with new tools for synthetic pangenome creation and evaluation.
Findings
Significantly reduces the number of contigs compared to de novo assemblers on simulated data
Remains competitive with current exhaustive search techniques, at the cost of a small increase in inaccuracies such as false joins
Demonstrates scalability and noise resilience, including a quantum device experiment
Abstract
De novo genome assembly is challenging in highly repetitive regions; however, reference-guided assemblers often suffer from bias. We propose a framework for pangenome-guided sequence assembly, which can resolve short-read data in complex regions without bias towards a single reference genome. Our primary contribution is to frame the assembly as a graph traversal optimisation problem, which can be implemented classically or on a quantum computer. The workflow involves first annotating pangenome graphs with estimated copy numbers for each node, then finding a path on the graph that best explains those copy numbers. On simulated data, our approach significantly reduces the number of contigs compared to de novo assemblers. While they introduce a small increase in inaccuracies, such as false joins, our optimisation-based methods are competitive with current exhaustive search techniques. They…
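The two-step workflow in the abstract (annotate each graph node with an estimated copy number, then find a path that best explains those copy numbers) can be illustrated with a toy example. The following is a minimal sketch under stated assumptions, not the authors' formulation: it uses plain exhaustive enumeration rather than a binary optimisation, it assumes a squared-difference cost between node visit counts and copy numbers, and the names `path_cost`, `best_path`, `edges`, and `copy_numbers` are hypothetical.

```python
from collections import Counter


def path_cost(path, copy_numbers):
    """Assumed cost: sum of squared differences between how often the
    path visits each node and that node's estimated copy number."""
    visits = Counter(path)
    return sum((visits[node] - cn) ** 2 for node, cn in copy_numbers.items())


def best_path(edges, copy_numbers, start, end, max_len):
    """Enumerate every walk from start to end with at most max_len nodes
    and return (cost, path) for the walk with the lowest cost."""
    best = (float("inf"), None)
    stack = [[start]]
    while stack:
        path = stack.pop()
        if path[-1] == end:
            cost = path_cost(path, copy_numbers)
            if cost < best[0]:
                best = (cost, path)
        if len(path) < max_len:
            # Extend the walk along every outgoing edge of its last node.
            for nxt in edges.get(path[-1], []):
                stack.append(path + [nxt])
    return best


# Toy graph: node B has a self-loop, modelling a tandem repeat.
edges = {"A": ["B"], "B": ["B", "C"], "C": []}
# Hypothetical copy-number annotation: B is estimated to occur three times.
copy_numbers = {"A": 1, "B": 3, "C": 1}

cost, path = best_path(edges, copy_numbers, "A", "C", max_len=6)
print(cost, path)  # 0 ['A', 'B', 'B', 'B', 'C']
```

Exhaustive enumeration grows exponentially with path length, which is one reason an optimisation-based formulation of the same objective is of interest for larger graphs.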
