Towards a Taxonomical Consensus: Diversity and Richness Inference from Large Scale rRNA gene Analysis
Dimitris Papamichail, Celine C. Lesaulnier, Steven Skiena, Sean R. McCorkle, Bernard Ollivier, Daniel van der Lelie

TL;DR
This paper presents a bioinformatics approach combining homology-based classification and clustering methods to accurately analyze microbial diversity and richness from large-scale 16S rRNA gene datasets, validated on soil microbial communities.
Contribution
It introduces an optimized BLAST classifier and clustering strategy that improves phylogenetic profiling and richness estimation in large microbial community analyses.
Findings
Homology-based classification matches the accuracy of the RDP classifier while better associating closely related sequences.
Complete linkage clustering enhances richness and evenness calculations.
Validated methodology on a dataset of approximately 2300 sequences from soil microbes.
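To make the clustering finding concrete, here is a minimal sketch of complete linkage clustering followed by richness and evenness calculations. It is not the authors' implementation: the function names, the toy distance matrix, and the 0.03 distance cutoff (a common 97% identity convention) are all assumptions chosen for illustration, and evenness is computed here as Pielou's index (Shannon diversity divided by the log of richness).

```python
import math
from collections import Counter


def complete_linkage(dist, threshold):
    """Agglomerative complete-linkage clustering on a symmetric distance matrix.

    At each step, merge the two clusters whose farthest pair of members is
    closest, and stop once that distance exceeds `threshold`.
    Returns one cluster label per sequence.
    """
    clusters = [[i] for i in range(len(dist))]
    while len(clusters) > 1:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # Complete linkage: cluster distance is the maximum pairwise distance.
                d = max(dist[i][j] for i in clusters[a] for j in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        d, a, b = best
        if d > threshold:
            break
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    labels = [0] * len(dist)
    for label, members in enumerate(clusters):
        for i in members:
            labels[i] = label
    return labels


def richness_and_evenness(labels):
    """Return (richness, Shannon index, Pielou evenness) for cluster labels."""
    counts = list(Counter(labels).values())
    total = sum(counts)
    richness = len(counts)
    shannon = -sum((c / total) * math.log(c / total) for c in counts)
    evenness = shannon / math.log(richness) if richness > 1 else 0.0
    return richness, shannon, evenness


# Hypothetical pairwise distances between five 16S rRNA gene sequences.
toy_dist = [
    [0.00, 0.01, 0.02, 0.10, 0.12],
    [0.01, 0.00, 0.02, 0.11, 0.12],
    [0.02, 0.02, 0.00, 0.10, 0.11],
    [0.10, 0.11, 0.10, 0.00, 0.02],
    [0.12, 0.12, 0.11, 0.02, 0.00],
]
labels = complete_linkage(toy_dist, threshold=0.03)  # assumed cutoff
richness, shannon, evenness = richness_and_evenness(labels)
print(labels, richness, round(shannon, 3), round(evenness, 3))
```

Because complete linkage scores a merge by the most distant pair of members, every sequence in a resulting cluster lies within the cutoff of every other member, which tends to give tighter clusters than single linkage. The quadratic search above is only meant for small examples; a dataset of a few thousand sequences would call for an optimized library routine.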
Abstract
Population analysis is persistently challenging but important, leading to the determination of diversity and function prediction of microbial community members. Here we detail our bioinformatics methods for analyzing population distribution and diversity in large microbial communities. This was achieved via (i) a homology-based method for robust phylotype determination, equaling the classification accuracy of the Ribosomal Database Project (RDP) classifier, but providing improved associations of closely related sequences; (ii) a comparison of different clustering methods for achieving more accurate richness estimations. Our methodology, which we developed using the RDP-vetted 16S rRNA gene sequence set, was validated by testing it on a large 16S rRNA gene dataset of approximately 2300 sequences, which we obtained from a soil microbial community study. We concluded that the best approach…
Taxonomy
Topics: Microbial Community Ecology and Physiology · Genomics and Phylogenetic Studies · Gut microbiota and health
