KMC 3: counting and manipulating k-mer statistics
Marek Kokot, Maciej Długosz, Sebastian Deorowicz

TL;DR
KMC 3 is an improved algorithm and toolset for efficient counting and manipulation of k-mer statistics in bioinformatics, enabling faster processing of large datasets.
Contribution
The paper introduces KMC 3, a significantly enhanced version of KMC 2, with new tools for handling k-mer databases in bioinformatics applications.
Findings
Demonstrates usefulness on real bioinformatics problems
Counts k-mers faster than its predecessor, KMC 2
Offers freely available tools for the community
Abstract
Summary: Counting all k-mers in a given dataset is a standard procedure in many bioinformatics applications. We introduce KMC 3, a significant improvement of the former KMC 2 algorithm together with KMC tools for manipulating k-mer databases. Usefulness of the tools is shown on a few real problems.
Availability: Program is freely available at http://sun.aei.polsl.pl/REFRESH/kmc.
Contact: [email protected]
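For readers new to the task named in the abstract, the following is a minimal in-memory sketch of what "counting all k-mers" means. It is not the KMC 3 algorithm, and it does not use the KMC tools or their file formats. The function name `count_kmers`, the example reads, and the choice to skip k-mers containing symbols other than A, C, G, T are assumptions made purely for illustration.

```python
from collections import Counter

VALID_BASES = set("ACGT")

def count_kmers(sequences, k):
    """Count every length-k substring (k-mer) across the given sequences.

    Illustrative only: a plain in-memory counter, not KMC's method.
    K-mers containing any symbol outside A, C, G, T are skipped
    (an assumption of this sketch).
    """
    counts = Counter()
    for seq in sequences:
        seq = seq.upper()
        # Slide a window of width k over the sequence.
        for i in range(len(seq) - k + 1):
            kmer = seq[i:i + k]
            if set(kmer) <= VALID_BASES:
                counts[kmer] += 1
    return counts

# Hypothetical toy input: two short reads.
reads = ["ACGTAC", "CGTACG"]
counts = count_kmers(reads, 3)
for kmer, n in sorted(counts.items()):
    print(kmer, n)
```

A dictionary-based counter like this holds every distinct k-mer in memory, which is why it only suits small inputs; dedicated counters such as KMC exist because real sequencing datasets are far too large for that approach.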
Taxonomy
Topics: Gene expression and cancer classification · Algorithms and Data Compression · Genomics and Phylogenetic Studies
