KATKA: A KRAKEN-like tool with $k$ given at query time
Travis Gagie, Sana Kashgouli, Ben Langmead

TL;DR
KATKA stores a phylogenetic tree so that, for each k-mer of a query pattern, it can quickly return the root of the smallest subtree containing every genome in which that k-mer occurs. Unlike KRAKEN, it takes k at query time rather than at construction time.
Contribution
It introduces a data structure over a phylogenetic tree that answers KRAKEN-style k-mer queries with k specified at query time rather than fixed at construction time.
Findings
Supports rapid subtree identification for given k-mers
Allows k to be specified dynamically at query time
Improves flexibility over existing tools like KRAKEN
Abstract
We describe a new tool, KATKA, that stores a phylogenetic tree $T$ such that later, given a pattern $P[1..m]$ and an integer $k$, it can quickly return the root of the smallest subtree of $T$ containing all the genomes in which the $k$-mer $P[i..i+k-1]$ occurs, for $1 \leq i \leq m-k+1$. This is similar to KRAKEN's functionality but with $k$ given at query time instead of at construction time.
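To make the query in the abstract concrete, the following is a naive reference sketch of what is returned, not of how KATKA computes it. It scans every genome for every k-mer, whereas the abstract describes an index built to answer such queries quickly. The tree encoding (a child-to-parent dictionary), the function names, and the toy genomes are all assumptions made for illustration.

```python
from typing import Dict, List, Optional

def path_to_root(parent: Dict[str, Optional[str]], node: str) -> List[str]:
    """Return the nodes from `node` up to the root, inclusive."""
    path = [node]
    while parent[node] is not None:
        node = parent[node]
        path.append(node)
    return path

def lca(parent: Dict[str, Optional[str]], u: str, v: str) -> str:
    """Lowest common ancestor of u and v, found by climbing from v."""
    ancestors = set(path_to_root(parent, u))
    while v not in ancestors:
        v = parent[v]
    return v

def kmer_roots(parent: Dict[str, Optional[str]],
               genomes: Dict[str, str],
               pattern: str,
               k: int) -> List[Optional[str]]:
    """For each k-mer of `pattern`, return the root of the smallest subtree
    containing all genomes in which it occurs (None if it occurs nowhere).
    Brute force: every genome is scanned for every k-mer."""
    roots: List[Optional[str]] = []
    for i in range(len(pattern) - k + 1):
        kmer = pattern[i:i + k]
        root: Optional[str] = None
        for leaf, seq in genomes.items():
            if kmer in seq:
                root = leaf if root is None else lca(parent, root, leaf)
        roots.append(root)
    return roots

# Hypothetical toy tree: root has children A and B; A has leaves g1 and g2; B has leaf g3.
parent = {"root": None, "A": "root", "B": "root",
          "g1": "A", "g2": "A", "g3": "B"}
genomes = {"g1": "ACGTAC", "g2": "ACGTTT", "g3": "TTTGCA"}

# The same tree and genomes are queried with two different values of k.
print(kmer_roots(parent, genomes, "ACGTT", 3))  # ['A', 'A', 'g2']
print(kmer_roots(parent, genomes, "ACGTT", 4))  # ['A', 'g2']
```

The two calls at the end use the same stored tree with different values of k, which is the flexibility the abstract contrasts with fixing k at construction time.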
Taxonomy
Topics: Genomics and Phylogenetic Studies · Algorithms and Data Compression · Biochemical and Structural Characterization
