LRez: C++ API and toolkit for analyzing and managing Linked-Reads data
Pierre Morisse, Claire Lemaitre, Fabrice Legeai

TL;DR
LRez is a C++ toolkit designed to facilitate the analysis and management of Linked-Reads data, enabling efficient barcode processing for genomics applications.
Contribution
It introduces the first dedicated C++ API and toolkit for manipulating Linked-Reads data, supporting various functionalities for genomic analysis.
Findings
Enables fast extraction and querying of barcodes from BAM and FASTQ files.
Supports computation of shared barcodes between genomic regions.
Improves performance of applications using Linked-Reads data.
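To make the shared-barcode finding concrete, here is a minimal sketch of counting the barcodes two genomic regions have in common. It does not use the LRez API: `BarcodeSet` and `countSharedBarcodes` are hypothetical names chosen for illustration, and barcodes are assumed to have already been extracted from each region as plain strings.

```cpp
#include <cstddef>
#include <string>
#include <unordered_set>

// Hypothetical type, not part of LRez: the barcodes observed in one region.
using BarcodeSet = std::unordered_set<std::string>;

// Hypothetical helper, not part of LRez: counts barcodes present in both sets.
std::size_t countSharedBarcodes(const BarcodeSet& regionA, const BarcodeSet& regionB) {
    // Iterate over the smaller set and probe the larger one,
    // so the cost is proportional to the smaller region.
    const bool aIsSmaller = regionA.size() <= regionB.size();
    const BarcodeSet& smaller = aIsSmaller ? regionA : regionB;
    const BarcodeSet& larger = aIsSmaller ? regionB : regionA;

    std::size_t shared = 0;
    for (const std::string& barcode : smaller) {
        if (larger.count(barcode) > 0) {
            ++shared;
        }
    }
    return shared;
}
```

Two regions that share many barcodes were likely covered by the same long DNA fragments, which is the kind of signal that assembly, phasing, and structural variant calling can draw on.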
Abstract
Linked-Reads technologies, such as 10x Genomics, combine the high quality and low cost of short-read sequencing with long-range information, through the use of barcodes that tag reads originating from a common long DNA fragment. These technologies have been employed in a broad range of applications, including genome assembly, phasing, and structural variant calling. However, to date, no tool or API dedicated to the manipulation of Linked-Reads data exists. We introduce LRez, a C++ API and toolkit which allows easy management of Linked-Reads data. LRez includes various functionalities for computing the number of common barcodes between genomic regions, extracting barcodes from BAM files, and indexing and querying both BAM and FASTQ files to quickly fetch reads or alignments sharing one or multiple barcodes. LRez can thus be used in a broad range of applications…
Taxonomy
Topics: Genomics and Phylogenetic Studies · Algorithms and Data Compression · Advanced Data Storage Technologies
Methods: Bottleneck Attention Module
