NanoPrePro: a fully equipped, fast, and memory-efficient preprocessor for nanopore transcriptomic sequencing
Chia-Chen Chu, Jhong-He Yu, Shang-Che Kuo, Fan-Wei Yang, Chia-Chang Lin, Chang-Hung Chen, Yi-Chen Wu, Cing Shih, Ying-Hsuan Sun, Te-Lun Mai, Ying-Lan Chen, Hsin-Hung Lin, Jung-Chen Su, Ying-Chung Jimmy Lin

TL;DR
NanoPrePro is a fast, memory-efficient preprocessor for nanopore transcriptomic sequencing reads that outperforms Pychopper, the current best preprocessor.
Contribution
NanoPrePro introduces a self-optimizing function and is significantly faster and more memory-efficient than current tools.
Findings
NanoPrePro outperforms Pychopper on simulated and real datasets.
It runs 38 times faster and uses less memory.
The tool offers customizable parameters for better precision in read preprocessing.
Abstract
NanoPrePro is a streamlined read preprocessor designed to identify full-length reads from Oxford Nanopore Technologies (ONT) transcriptomic sequencing results with high precision, which it achieves through precise identification of adapters/primers. The preprocessing of ONT reads has long been a neglected and ambiguous area that lacks thorough, systematic investigation. Here, we developed NanoPrePro, which outperformed the current best preprocessor, Pychopper, on simulated and real datasets. Using sequence similarity, adapter/primer location, and adapter/primer length, NanoPrePro applies a self-optimizing function that extracts the best parameters for each sequencing file so that users can customize their analyses. Furthermore, NanoPrePro runs 38 times faster with less memory cost. NanoPrePro can be regarded as the state-of-the-art preprocessor with forward…
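To make the idea of calling full-length reads from adapter/primer similarity and location concrete, here is a minimal Python sketch. It is not NanoPrePro's algorithm or API. The function names, the ungapped identity score, the example primer sequences, and the `min_identity` and `max_offset` thresholds are all assumptions chosen for illustration. A real preprocessor would use gapped alignment and handle both strand orientations.

```python
def identity(a: str, b: str) -> float:
    """Fraction of matching positions between two equal-length strings."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


def best_hit(region: str, primer: str) -> tuple[float, int]:
    """Slide the primer across the region without gaps.

    Returns (best identity, offset of the best window), or (0.0, -1)
    when the region is shorter than the primer.
    """
    best_score, best_offset = 0.0, -1
    for i in range(len(region) - len(primer) + 1):
        score = identity(region[i:i + len(primer)], primer)
        if score > best_score:  # keep the first window with the top score
            best_score, best_offset = score, i
    return best_score, best_offset


def is_full_length(read: str, primer5: str, primer3: str,
                   min_identity: float = 0.8, max_offset: int = 20) -> bool:
    """Hypothetical rule: a read is full-length when both primers are found
    within max_offset bases of their respective read ends."""
    head = read[:max_offset + len(primer5)]    # search window at the 5' end
    tail = read[-(max_offset + len(primer3)):]  # search window at the 3' end
    score5, _ = best_hit(head, primer5)
    score3, _ = best_hit(tail, primer3)
    return score5 >= min_identity and score3 >= min_identity


# Illustrative sequences only (not real ONT primers).
PRIMER5 = "ACGTACGTAC"
PRIMER3 = "TTGGCCAATT"
complete_read = "CC" + PRIMER5 + "G" * 30 + PRIMER3 + "A"
truncated_read = "CC" + PRIMER5 + "G" * 30

print(is_full_length(complete_read, PRIMER5, PRIMER3))
print(is_full_length(truncated_read, PRIMER5, PRIMER3))
```

In this sketch the two thresholds play the role of the similarity and location parameters mentioned in the abstract. A self-optimizing step would choose such values per sequencing file instead of fixing them in advance.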
Taxonomy
Topics: Genomics and Phylogenetic Studies · Single-cell and spatial transcriptomics · Gene expression and cancer classification
