META$^\mathbf{2}$: Memory-efficient taxonomic classification and abundance estimation for metagenomics with deep learning
Andreas Georgiou, Vincent Fortuin, Harun Mustafa, Gunnar Rätsch

TL;DR
This paper introduces META$^\mathbf{2}$, a memory-efficient deep learning approach for taxonomic classification and abundance estimation in metagenomics. It combines locality-sensitive hashing with multiple instance learning to outperform conventional mapping-based methods under a fixed memory budget.
Contribution
The paper presents a novel memory-efficient deep learning method that combines locality-sensitive hashing and multiple instance learning (MIL) for improved taxonomic classification and abundance estimation in metagenomics.
Findings
Outperforms conventional mapping-based methods within fixed memory constraints.
Utilizes MIL with permutation-invariant pooling to exploit co-occurrence patterns.
Achieves higher accuracy in predicting taxa distribution at higher taxonomic ranks.
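To illustrate the permutation-invariant pooling idea mentioned above, here is a minimal sketch in plain Python. It assumes a per-read classifier has already produced one logit vector per read; the function names (`softmax`, `pool_bag`) and the choice of mean pooling are illustrative assumptions, not the paper's actual architecture.

```python
import math
from typing import List, Sequence


def softmax(logits: Sequence[float]) -> List[float]:
    """Convert one read's logits into a probability vector over taxa."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def pool_bag(read_logits: Sequence[Sequence[float]]) -> List[float]:
    """Mean-pool per-read probabilities into one abundance estimate.

    The mean does not depend on the order of the reads, so the bag-level
    output is permutation-invariant (hypothetical stand-in for the pooling
    layer of a MIL model).
    """
    probs = [softmax(r) for r in read_logits]
    num_reads = len(probs)
    num_taxa = len(probs[0])
    return [sum(p[j] for p in probs) / num_reads for j in range(num_taxa)]


# A toy "bag" of three reads scored against three taxa.
bag = [[2.0, 0.1, -1.0], [0.3, 1.5, 0.0], [1.0, 1.0, 1.0]]
abundance = pool_bag(bag)
```

Reordering the reads in `bag` leaves `abundance` unchanged up to floating-point rounding, which is the property that lets a MIL model treat a metagenomic sample as an unordered set of reads.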
Abstract
Metagenomic studies have increasingly utilized sequencing technologies in order to analyze DNA fragments found in environmental samples. One important step in this analysis is the taxonomic classification of the DNA fragments. Conventional read classification methods require large databases and vast amounts of memory to run, with recent deep learning methods suffering from very large model sizes. We therefore aim to develop a more memory-efficient technique for taxonomic classification. A task of particular interest is abundance estimation in metagenomic samples. Current attempts rely on classifying single DNA reads independently from each other and are therefore agnostic to co-occurrence patterns between taxa. In this work, we also attempt to take these patterns into account. We develop a novel memory-efficient read classification technique, combining deep learning and locality-sensitive…
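As a rough illustration of how locality-sensitive hashing can bound memory, the sketch below maps k-mers to a fixed number of buckets using bit sampling, a standard LSH family for Hamming distance. The specific scheme, the helper names (`make_bit_sampling_hash`, `read_to_buckets`), and all parameter values are assumptions for illustration; the summary above does not specify which LSH scheme the paper uses.

```python
import random
import zlib
from typing import Callable, List, Tuple


def make_bit_sampling_hash(
    k: int, num_sampled: int, num_buckets: int, seed: int = 0
) -> Tuple[Callable[[str], int], List[int]]:
    """Build a bit-sampling LSH function for k-mers (hypothetical helper).

    Only `num_sampled` randomly chosen positions of each k-mer are hashed,
    so k-mers that differ solely at unsampled positions share a bucket.
    """
    rng = random.Random(seed)
    positions = sorted(rng.sample(range(k), num_sampled))

    def lsh(kmer: str) -> int:
        if len(kmer) != k:
            raise ValueError(f"expected a k-mer of length {k}")
        key = "".join(kmer[i] for i in positions)
        # crc32 is deterministic across runs, unlike Python's built-in hash().
        return zlib.crc32(key.encode("ascii")) % num_buckets

    return lsh, positions


def read_to_buckets(read: str, k: int, lsh: Callable[[str], int]) -> List[int]:
    """Slide a window of length k over the read and hash every k-mer."""
    return [lsh(read[i:i + k]) for i in range(len(read) - k + 1)]


lsh, positions = make_bit_sampling_hash(k=12, num_sampled=8, num_buckets=1024)
bucket_ids = read_to_buckets("ACGTACGTACGTACGT", 12, lsh)
```

In a setup like this, the bucket indices could address rows of a learned embedding table, so the table size is set by `num_buckets` rather than by the number of distinct k-mers.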
Taxonomy
Topics: Genomics and Phylogenetic Studies · Gene expression and cancer classification · Algorithms and Data Compression
