A Graph Auto-Encoder for Haplotype Assembly and Viral Quasispecies Reconstruction
Ziqi Ke, Haris Vikalo

TL;DR
This paper introduces a graph auto-encoder neural network framework that improves the accuracy of reconstructing haplotypes and viral quasispecies from DNA sequencing data, outperforming existing methods.
Contribution
A novel graph auto-encoder approach tailored to haplotype assembly and viral quasispecies reconstruction, exploiting structural properties of sequencing data for improved accuracy.
Findings
Reliable assembly of haplotypes demonstrated on synthetic and experimental data
Significant performance improvements over state-of-the-art methods
Effective in ignoring sequencing errors during reconstruction
Abstract
Reconstructing components of a genomic mixture from data obtained by means of DNA sequencing is a challenging problem encountered in a variety of applications including single individual haplotyping and studies of viral communities. High-throughput DNA sequencing platforms oversample mixture components to provide massive amounts of reads whose relative positions can be determined by mapping the reads to a known reference genome; assembly of the components, however, requires discovery of the reads' origin -- an NP-hard problem that the existing methods struggle to solve with the required level of accuracy. In this paper, we present a learning framework based on a graph auto-encoder designed to exploit structural properties of sequencing data. The algorithm is a neural network which essentially trains to ignore sequencing errors and infers the posteriori probabilities of the origin of…
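To make the role of read origin concrete, here is a minimal Python sketch. It is not the paper's graph auto-encoder: it assumes the origin of every read is already known and only shows why that information matters. Once reads are grouped by component, a per-position majority vote recovers each component and outvotes isolated sequencing errors. The function names, the `-` gap encoding, and the toy reads are all hypothetical, chosen for illustration.

```python
from collections import Counter

GAP = "-"  # hypothetical marker for a position the read does not cover


def consensus(reads, origin, k):
    """Majority-vote consensus for each of k components.

    reads  : equal-length strings aligned to a common reference
    origin : origin[i] is the component index assumed for reads[i]
    """
    n = len(reads[0])
    result = []
    for c in range(k):
        members = [r for r, o in zip(reads, origin) if o == c]
        seq = []
        for j in range(n):
            # Count only the reads that actually cover position j.
            votes = Counter(r[j] for r in members if r[j] != GAP)
            seq.append(votes.most_common(1)[0][0] if votes else GAP)
        result.append("".join(seq))
    return result


def mismatches(reads, origin, haplotypes):
    """Count covered positions where a read disagrees with its component."""
    return sum(
        1
        for r, o in zip(reads, origin)
        for a, b in zip(r, haplotypes[o])
        if a != GAP and a != b
    )


# Toy data: the last read carries one error (A instead of T at position 3).
reads = ["ACG--", "-CGT-", "--GTA", "TCA--", "-CAT-", "ACGA-"]
origin = [0, 0, 0, 1, 1, 0]

haps = consensus(reads, origin, k=2)
print(haps)                             # ['ACGTA', 'TCAT-']
print(mismatches(reads, origin, haps))  # 1
```

In practice the `origin` list is exactly what is unknown, and inferring it is the hard part that the abstract describes.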
Taxonomy
Topics: Genomics and Phylogenetic Studies · RNA and protein synthesis mechanisms · Chromosomal and Genetic Variations
