TL;DR
AirLift is a tool that remaps sequencing reads from one reference genome to another similar reference up to 27.4x faster than full mapping, while maintaining high accuracy in downstream SNP/INDEL variant detection.
Contribution
It introduces the first read remapping technique that comprehensively remaps a read set between versions of a reference genome while running faster than full mapping.
Findings
Remaps reads up to 27.4x faster than full mapping.
Maintains high accuracy in SNP/INDEL variant detection.
Validated with GATK on real datasets.
Abstract
AirLift is the first read remapping tool that enables users to quickly and comprehensively map a read set that was previously mapped to one reference genome to another similar reference. Users can then quickly rerun downstream analysis of their read sets against each new reference release. Compared to the state-of-the-art method for remapping reads (i.e., full mapping), AirLift reduces the overall execution time to remap read sets between two reference genome versions by up to 27.4x. We validate our remapping results with GATK and find that AirLift provides high accuracy in identifying ground truth SNP/INDEL variants. AirLift source code and a readme describing how to reproduce our results are available at https://github.com/CMU-SAFARI/AirLift.
