MEEPTOOLS: A maximum expected error based FASTQ read filtering and trimming toolkit
Vishal N. Koparde, Hardik I. Parikh, Steven P. Bradley, Nihar U. Sheth

TL;DR
MEEPTOOLS is an open-source toolkit that improves quality control in next-generation sequencing by using a maximum expected error metric to filter and trim FASTQ reads more effectively than traditional methods.
Contribution
It introduces a novel approach based on maximum expected error percentages for read filtering and trimming, enhancing data reliability in sequencing analysis.
Findings
Retains more reliable bases than traditional methods
Removes more unreliable bases effectively
Provides a non-logarithmic quality assessment
Abstract
Next-generation sequencing technology rapidly produces massive volumes of data, and quality control of these sequencing data is essential to any genomic analysis. Here we present MEEPTOOLS, a collection of open-source tools based on maximum expected error as a percentage of read length (MEEP score) to filter, trim, truncate and assess next-generation DNA sequencing data in FASTQ file format. MEEPTOOLS provides a non-traditional approach to read filtering and trimming based on the maximum error probabilities of the bases in the read on a non-logarithmic scale. This method simultaneously retains more reliable bases and removes more unreliable bases than traditional quality filtering strategies.
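To make the MEEP score concrete, the following is a minimal Python sketch, not the MEEPTOOLS implementation. It assumes Phred+33 quality encoding, the standard Phred relation p = 10^(-Q/10), and that the score is the sum of per-base error probabilities divided by read length, expressed as a percentage. The function names and the default cutoff of 1.0 are hypothetical, chosen only for illustration.

```python
def meep_score(quality_string, offset=33):
    """Expected errors as a percentage of read length (illustrative sketch).

    Assumes an ASCII-encoded Phred quality string; offset=33 is an assumption.
    """
    if not quality_string:
        raise ValueError("quality string must not be empty")
    # Convert each quality character to an error probability: p = 10^(-Q/10).
    error_probs = [10 ** (-(ord(ch) - offset) / 10) for ch in quality_string]
    # The sum of probabilities is the expected number of errors in the read.
    expected_errors = sum(error_probs)
    return 100.0 * expected_errors / len(quality_string)


def passes_filter(quality_string, max_meep=1.0, offset=33):
    """Keep a read if its score is at or below a hypothetical cutoff."""
    return meep_score(quality_string, offset) <= max_meep


if __name__ == "__main__":
    for qual in ("IIII", "++++", "II++"):
        print(qual, meep_score(qual), passes_filter(qual))
```

Because the score averages error probabilities rather than Phred values, a few low-quality bases dominate it: in this sketch a read with half its bases at Q40 and half at Q10 scores close to the all-Q10 read's half value, not near the midpoint a mean-Q filter would suggest.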
Taxonomy
Topics: Genomics and Phylogenetic Studies · Algorithms and Data Compression · RNA and protein synthesis mechanisms
