Kernel-Penalized Regression for Analysis of Microbiome Data
Timothy W. Randolph, Sen Zhao, Wade Copeland, Meredith Hullar, Ali Shojaie

TL;DR
This paper introduces a kernel-penalized regression framework that integrates ecological similarity measures into high-dimensional microbiome data analysis, so that association models can use the same non-Euclidean structure that ordination methods such as principal coordinate analysis rely on.
Contribution
It extends traditional regression models by incorporating multiple similarity matrices, including phylogenetic information, and addresses compositional data challenges in microbiome studies.
Findings
Effective incorporation of ecological similarity matrices into regression models.
Application to gut and vaginal microbiome data demonstrates practical utility.
Significance testing for microbial associations enhances interpretability.
Abstract
The analysis of human microbiome data is often based on dimension-reduced graphical displays and clustering derived from vectors of microbial abundances in each sample. Common to these ordination methods is the use of biologically motivated definitions of similarity. Principal coordinate analysis, in particular, is often performed using ecologically defined distances, allowing analyses to incorporate context-dependent, non-Euclidean structure. Here we describe how to take a step beyond ordination plots and incorporate this structure into high-dimensional penalized regression models. Within this framework, the estimate of a regression coefficient vector is obtained via the joint eigen properties of multiple similarity matrices, or kernels. This allows for multivariate regression models to incorporate both a matrix of microbial abundances and, for instance, a matrix of…
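As a rough illustration of the idea in the abstract, the sketch below turns an ecological distance matrix into a similarity kernel (the Gower-centered matrix that underlies principal coordinate analysis) and uses it in an ordinary kernel ridge regression. This is not the paper's estimator, which combines multiple kernels through their joint eigen properties; it is a single-kernel simplification. The function names, the choice of Bray-Curtis dissimilarity, the eigenvalue clipping, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np

def bray_curtis(X):
    """Pairwise Bray-Curtis dissimilarity between rows of a nonnegative abundance matrix."""
    n = X.shape[0]
    D = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            denom = X[i].sum() + X[j].sum()
            D[i, j] = np.abs(X[i] - X[j]).sum() / denom if denom > 0 else 0.0
    return D

def gower_kernel(D):
    """Gower-centered similarity K = -1/2 * J D^2 J; its eigenvectors give the PCoA axes."""
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n  # centering matrix
    return -0.5 * J @ (D ** 2) @ J

def kernel_ridge_fit(K, y, lam):
    """Fitted values K (K + lam I)^{-1} (y - mean(y)) + mean(y) for a similarity kernel K."""
    n = K.shape[0]
    # Bray-Curtis is non-Euclidean, so K can have negative eigenvalues.
    # Assumption: clip them to zero to obtain a positive semidefinite kernel.
    w, V = np.linalg.eigh(K)
    w = np.clip(w, 0.0, None)
    K_psd = (V * w) @ V.T
    y_mean = y.mean()
    alpha = np.linalg.solve(K_psd + lam * np.eye(n), y - y_mean)
    return K_psd @ alpha + y_mean, alpha

# Synthetic example (hypothetical data, not from the paper):
# 20 samples, 8 taxa, and a continuous outcome.
rng = np.random.default_rng(0)
abundances = rng.poisson(5.0, size=(20, 8)).astype(float)
outcome = rng.normal(size=20)

D = bray_curtis(abundances)
K = gower_kernel(D)
fitted, alpha = kernel_ridge_fit(K, outcome, lam=1.0)
print(fitted.shape)
```

The penalty parameter `lam` controls how strongly the fit is shrunk toward the outcome mean: directions of the kernel with small eigenvalues are shrunk the most, which is one way of seeing how the similarity structure shapes the regression estimate.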
