Blini: lightweight nucleotide sequence search and dereplication
Amit Lavon

TL;DR
Blini is a fast, memory-efficient tool for nucleotide sequence search and dereplication, built for large sequence collections; in benchmarks on simulated data it preserves search and clustering accuracy.
Contribution
It introduces a lightweight method for nucleotide sequence lookup and dereplication that, in benchmarks on simulated data, runs faster and uses less memory than existing tools.
Findings
Faster than existing tools on simulated data
Uses less RAM while maintaining accuracy
Aimed at collections too big for online tools or too demanding for a local machine
Abstract
Blini is a tool for quick lookup of nucleotide sequences in databases, and for quick dereplication of sequence collections. It is meant to help clean and characterize large collections of assembled contigs or long sequences that would otherwise be too big to search with online tools, or too demanding for a local machine to process. Benchmarks on simulated data demonstrate that it is faster than existing tools and requires less RAM, while preserving search and clustering accuracy.
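The abstract does not say how Blini performs dereplication, so the following is only a generic illustration of what dereplicating a sequence collection means, not Blini's method. It is a minimal sketch that assumes exact k-mer sets, Jaccard similarity, and greedy clustering; the function names, `k=8`, and the `0.9` threshold are all hypothetical choices made for the example.

```python
def kmers(seq, k=8):
    """Return the set of all k-length substrings of a sequence (uppercased)."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets; defined as 0.0 when both are empty."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def dereplicate(seqs, k=8, threshold=0.9):
    """Greedy clustering (illustrative only, not Blini's algorithm).

    seqs: dict mapping sequence name -> nucleotide string.
    Returns a dict mapping each representative name -> list of member names.
    """
    reps = {}      # representative name -> its k-mer set
    clusters = {}  # representative name -> names assigned to it
    # Visit longer sequences first so that they become the representatives.
    for name in sorted(seqs, key=lambda n: (-len(seqs[n]), n)):
        ks = kmers(seqs[name], k)
        for rep, rep_ks in reps.items():
            if jaccard(ks, rep_ks) >= threshold:
                clusters[rep].append(name)  # near-duplicate of an existing rep
                break
        else:
            reps[name] = ks                 # nothing similar: new representative
            clusters[name] = [name]
    return clusters


example = {
    "a": "ACGTACGTTGCAAGCTTAGGCTA",
    "b": "ACGTACGTTGCAAGCTTAGGCTA",      # exact duplicate of "a"
    "c": "TTTTTTTTGGGGGGGGCCCCCCCCAA",   # unrelated sequence
}
clusters = dereplicate(example)
```

Comparing every sequence against every representative with full k-mer sets is quadratic in the worst case and memory-hungry, which is the kind of cost a tool meant for large collections has to avoid.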
Taxonomy
Topics: Genomics and Phylogenetic Studies · Fractal and DNA sequence analysis · Machine Learning in Bioinformatics
