Supervised Learning and Model Analysis with Compositional Data
Shimeng Huang, Elisabeth Ailer, Niki Kilbertus, Niklas Pfister

TL;DR
KernelBiome is a nonparametric, kernel-based framework for compositional microbiome data. It captures complex signals, provides interpretable insights, and achieves predictive performance comparable to or better than existing methods.
Contribution
The paper introduces KernelBiome, a kernel-based nonparametric regression and classification method tailored to sparse compositional data. The method can incorporate prior knowledge, such as phylogenetic structure, and comes with tools for interpreting the fitted model.
Findings
Achieves comparable or better predictive performance than state-of-the-art methods.
Provides novel interpretability tools for compositional data analysis.
Demonstrates effectiveness on real microbiome datasets.
Abstract
The compositionality and sparsity of high-throughput sequencing data pose a challenge for regression and classification. However, in microbiome research in particular, conditional modeling is an essential tool to investigate relationships between phenotypes and the microbiome. Existing techniques are often inadequate: they either rely on extensions of the linear log-contrast model (which adjusts for compositionality, but is often unable to capture useful signals), or they are based on black-box machine learning methods (which may capture useful signals, but ignore compositionality in downstream analyses). We propose KernelBiome, a kernel-based nonparametric regression and classification framework for compositional data. It is tailored to sparse compositional data and is able to incorporate prior knowledge, such as phylogenetic structure. KernelBiome captures complex signals,…
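To make the idea of kernel-based regression on compositional data concrete, here is a minimal sketch in Python. It is not the authors' implementation and does not use the KernelBiome package. It assumes one particular kernel, a Gaussian kernel on centered log-ratio (CLR) coordinates with a pseudocount to handle zeros, together with plain kernel ridge regression. All function names, the pseudocount, `gamma`, and `lam` are hypothetical choices made for illustration, and the data are simulated.

```python
import numpy as np

def clr(X, pseudocount=1e-3):
    """Centered log-ratio transform (assumed zero handling: pseudocount)."""
    X = np.asarray(X, dtype=float)
    X = X / X.sum(axis=1, keepdims=True)   # close rows to the simplex first
    X = X + pseudocount                    # replace exact zeros
    X = X / X.sum(axis=1, keepdims=True)   # re-close after adding the pseudocount
    logX = np.log(X)
    return logX - logX.mean(axis=1, keepdims=True)

def aitchison_rbf_kernel(A, B, gamma=0.1, pseudocount=1e-3):
    """Gaussian kernel in CLR coordinates (illustrative choice of kernel)."""
    Za, Zb = clr(A, pseudocount), clr(B, pseudocount)
    sq_dist = (Za**2).sum(axis=1)[:, None] + (Zb**2).sum(axis=1)[None, :] - 2.0 * Za @ Zb.T
    return np.exp(-gamma * np.maximum(sq_dist, 0.0))

def fit_kernel_ridge(K, y, lam=1e-2):
    """Solve (K + lam * I) alpha = y for the dual coefficients."""
    return np.linalg.solve(K + lam * np.eye(K.shape[0]), y)

# Simulated sparse compositions: Dirichlet draws with small entries set to zero.
rng = np.random.default_rng(0)
X = rng.dirichlet(np.ones(8), size=80)
X[X < 0.02] = 0.0
X = X / X.sum(axis=1, keepdims=True)

# Simulated nonlinear response depending on a log-ratio of two components.
y = np.log((X[:, 0] + 1e-3) / (X[:, 1] + 1e-3)) ** 2 + 0.1 * rng.standard_normal(80)

X_train, y_train, X_test = X[:60], y[:60], X[60:]
K_train = aitchison_rbf_kernel(X_train, X_train)
alpha = fit_kernel_ridge(K_train, y_train)
y_pred = aitchison_rbf_kernel(X_test, X_train) @ alpha
```

Because each row is closed to the simplex before the transform, the kernel depends only on relative abundances: multiplying the input counts by a constant leaves the kernel matrix unchanged. The pseudocount is the crudest way to deal with zeros and is used here only to keep the sketch short.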
Taxonomy
Topics: Geochemistry and Geologic Mapping · Oral microbiology and periodontitis research · Genomics and Phylogenetic Studies
