Modelling phylogeny in 16S rRNA gene sequencing datasets using string-based kernels
Jonathan Ish-Horowicz, Sarah Filippi

TL;DR
This paper introduces a novel string kernel-based method to model phylogenetic relationships in 16S rRNA gene sequencing data, improving statistical analysis of microbiome datasets and host trait prediction.
Contribution
The paper proposes a new family of kernels for analysing microbiome datasets, built on string kernels from the natural language processing literature, and demonstrates their utility in two statistical tasks: the two-sample test and host trait prediction.
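To make the idea of a string kernel concrete, the sketch below implements the k-mer spectrum kernel, one standard string kernel, on nucleotide sequences. It is an illustration only and is not claimed to be the kernel family proposed in the paper; the function names, the choice k=3, and the toy sequences are all assumptions made for this example.

```python
from collections import Counter
import math


def kmer_counts(seq, k):
    """Count every contiguous substring of length k (k-mer) in seq."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def spectrum_kernel(s, t, k=3):
    """Spectrum kernel: inner product of the k-mer count vectors of s and t."""
    cs, ct = kmer_counts(s, k), kmer_counts(t, k)
    return sum(cs[m] * ct[m] for m in cs.keys() & ct.keys())


def normalised_spectrum_kernel(s, t, k=3):
    """Cosine-normalised spectrum kernel, so that k(s, s) = 1 for non-empty spectra."""
    denom = math.sqrt(spectrum_kernel(s, s, k) * spectrum_kernel(t, t, k))
    return spectrum_kernel(s, t, k) / denom if denom > 0 else 0.0


# Hypothetical toy sequences (not real 16S rRNA reads).
seq_a = "ACGTACGTGG"
seq_b = "ACGTACGTGC"  # differs from seq_a in the final base only
seq_c = "TTGACCATTA"  # shares no 3-mers with seq_a

sim_ab = normalised_spectrum_kernel(seq_a, seq_b)
sim_ac = normalised_spectrum_kernel(seq_a, seq_c)
print(sim_ab, sim_ac)
```

Sequences that share many k-mers receive a high similarity, which is the sense in which a string kernel can stand in for evolutionary relatedness between taxa without an explicit tree.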
Findings
A kernel two-sample test using the proposed kernel is sensitive to the phylogenetic scale of the difference between the two populations.
Gaussian process models infer bacterial-host effects across phylogeny.
Method outperforms traditional approaches in simulation studies.
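The kernel two-sample test mentioned in the findings can be sketched with the maximum mean discrepancy (MMD) and a permutation null. This is a generic sketch under stated assumptions, not the paper's implementation: it accepts any positive semi-definite kernel matrix between samples, and the linear kernel on synthetic Gaussian data below is a stand-in for the proposed string-based kernel. The function names and all parameter values are hypothetical.

```python
import numpy as np


def mmd2_biased(K, n):
    """Biased MMD^2 estimate from a pooled kernel matrix K.

    The first n rows/columns of K belong to sample X, the rest to sample Y.
    """
    Kxx, Kyy, Kxy = K[:n, :n], K[n:, n:], K[:n, n:]
    return Kxx.mean() + Kyy.mean() - 2.0 * Kxy.mean()


def permutation_test(K, n, n_perm=500, seed=0):
    """Permutation p-value for the MMD^2 statistic.

    Group labels are shuffled by permuting rows and columns of K together.
    """
    rng = np.random.default_rng(seed)
    observed = mmd2_biased(K, n)
    total = K.shape[0]
    exceed = 0
    for _ in range(n_perm):
        p = rng.permutation(total)
        if mmd2_biased(K[np.ix_(p, p)], n) >= observed:
            exceed += 1
    # Add-one correction keeps the p-value strictly positive.
    return observed, (exceed + 1) / (n_perm + 1)


# Synthetic stand-in data: two groups of 20 samples with shifted means.
rng = np.random.default_rng(1)
X = rng.normal(0.0, 1.0, size=(20, 5))
Y = rng.normal(3.0, 1.0, size=(20, 5))
Z = np.vstack([X, Y])
K = Z @ Z.T  # linear kernel, assumed here in place of a string-based kernel

stat, p_value = permutation_test(K, n=20)
print(stat, p_value)
```

Swapping in a different kernel matrix changes what kind of difference the test can detect, which is how the choice of kernel governs sensitivity to the scale of the difference between populations.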
Abstract
The bacterial microbiome is increasingly being recognised as a key factor in human health, driven in large part by datasets collected using 16S rRNA (ribosomal ribonucleic acid) gene sequencing, which enable cost-effective quantification of the composition of an individual's bacterial community. One of the defining characteristics of 16S rRNA datasets is the evolutionary relationships that exist between taxa (phylogeny). Here, we demonstrate the utility of modelling these phylogenetic relationships in two statistical tasks (the two-sample test and host trait prediction) and propose a novel family of kernels for analysing microbiome datasets by leveraging string kernels from the natural language processing literature. We show via simulation studies that a kernel two-sample test using the proposed kernel is sensitive to the phylogenetic scale of the difference between the two populations.…
Taxonomy
Topics: Genomics and Phylogenetic Studies · Gene expression and cancer classification · Gut microbiota and health
