$\Pi$-cyc: A Reference-free SNP Discovery Application using Parallel Graph Search
Reda Younsi, Jing Tang, Liisa Holm

TL;DR
This paper presents $\Pi$-cyc, a parallel, reference-free SNP discovery tool that efficiently enumerates cycles in coloured de Bruijn graphs for genome analysis, leveraging multi-core and distributed computing.
Contribution
It introduces a novel parallel graph search algorithm for cycle enumeration in coloured de Bruijn graphs, improving speed and scalability for SNP discovery.
Findings
Achieved faster cycle enumeration using multi-core parallelism.
Successfully applied $\Pi$-cyc to genomes of Schizosaccharomyces pombe.
Open-source implementation available for community use.
Abstract
Motivation: Working with a large number of genomes simultaneously is of great interest in population genetics and comparative genomics research. Bubble discovery in multi-genome coloured de Bruijn graphs for de novo genome assembly is a problem that can be translated to cycle enumeration in graph theory. Cycle enumeration algorithms in big and complex de Bruijn graphs are time consuming. Specialised fast algorithms for efficient bubble search are needed for coloured de Bruijn graph variant calling applications. In coloured de Bruijn graphs, bubble path coverages are used in downstream variant calling analysis. Results: In this paper, we introduce a fast parallel graph search for different K-mer cycle sizes. Coloured path coverages are used for SNP prediction. The graph search method uses a combined multi-node and multi-core design to speed up cycle enumeration. The search…
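To make the bubble idea in the abstract concrete, here is a minimal, serial Python sketch. It is not the $\Pi$-cyc implementation. It assumes a toy coloured de Bruijn graph whose nodes are (k-1)-mers, records only which samples (colours) contain each edge rather than their coverage, and ignores reverse complements. The function names (`build_coloured_dbg`, `walk_branch`, `find_bubbles`, `path_colours`) and the two example sequences are hypothetical, chosen only for illustration.

```python
from collections import defaultdict
from itertools import combinations


def build_coloured_dbg(samples, k):
    """Map each edge (prefix (k-1)-mer, suffix (k-1)-mer) to the colours containing that k-mer."""
    edges = defaultdict(set)
    for colour, seq in samples.items():
        for i in range(len(seq) - k + 1):
            kmer = seq[i:i + k]
            edges[(kmer[:-1], kmer[1:])].add(colour)
    return dict(edges)


def walk_branch(succ, source, first, max_steps):
    """Follow one branch from `source` while nodes have exactly one successor."""
    path = [source, first]
    while len(path) - 1 < max_steps:
        nxt = succ.get(path[-1], [])
        if len(nxt) != 1:
            break
        path.append(nxt[0])
    return path


def find_bubbles(edges, max_steps):
    """Return pairs of paths that diverge at one node and rejoin at another."""
    succ = defaultdict(list)
    for u, v in edges:
        succ[u].append(v)
    succ = dict(succ)
    bubbles = []
    for source, outs in succ.items():
        if len(outs) < 2:
            continue  # only branching nodes can open a bubble
        branches = [walk_branch(succ, source, v, max_steps) for v in sorted(outs)]
        for a, b in combinations(branches, 2):
            # first node after the source that both branches reach
            shared = [n for n in a[1:] if n in b[1:]]
            if shared:
                sink = shared[0]
                bubbles.append((a[:a.index(sink, 1) + 1], b[:b.index(sink, 1) + 1]))
    return bubbles


def path_colours(edges, path):
    """Colours present on every edge of the path."""
    return set.intersection(*(edges[(u, v)] for u, v in zip(path, path[1:])))


# Two toy sequences that differ by a single base (A vs C at position 6).
samples = {"sample_1": "CATTGACGGTA", "sample_2": "CATTGCCGGTA"}
edges = build_coloured_dbg(samples, k=4)
for branch_a, branch_b in find_bubbles(edges, max_steps=6):
    print(branch_a, sorted(path_colours(edges, branch_a)))
    print(branch_b, sorted(path_colours(edges, branch_b)))
```

In this toy input the two branches leave the shared node `TTG` and rejoin at `CGG`. Each branch carries exactly one colour, which is the pattern a colour-aware caller would read as a candidate SNP. Together the two branches form a cycle in the undirected view of the graph, which is why bubble search can be posed as cycle enumeration.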
Taxonomy
Topics: Genomics and Phylogenetic Studies · RNA and protein synthesis mechanisms · Chromosomal and Genetic Variations
