Robust Regression with Compositional Covariates
Aditya Mishra, Christian L. Muller

TL;DR
This paper introduces RobRegCC, a robust regression framework for compositional biological data that detects outliers while estimating sparse associations between compositions and covariates. It is evaluated in simulations and on HIV microbiome data, and comes with theoretical guarantees for the estimator.
Contribution
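The paper presents RobRegCC, a robust regression method for compositional covariates that extends the linear log-contrast model with a mean shift formulation for capturing outliers. It combines sparsity-promoting convex and non-convex penalties, a data-driven robust initialization procedure, and a robust cross-validation scheme for model selection.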
Findings
RobRegCC performs sparse log-contrast regression and outlier detection simultaneously across a wide range of simulation settings.
It identifies meaningful microbial associations in HIV microbiome data.
The method provides theoretical guarantees for estimator performance.
Abstract
Many biological high-throughput data sets, such as targeted amplicon-based and metagenomic sequencing data, are compositional in nature. A common exploratory data analysis task is to infer statistical associations between the high-dimensional microbial compositions and habitat- or host-related covariates. We propose a general robust statistical regression framework, RobRegCC (Robust Regression with Compositional Covariates), which extends the linear log-contrast model by a mean shift formulation for capturing outliers. RobRegCC includes sparsity-promoting convex and non-convex penalties for parsimonious model estimation, a data-driven robust initialization procedure, and a novel robust cross-validation model selection scheme. We show RobRegCC's ability to perform simultaneous sparse log-contrast regression and outlier detection over a wide range of simulation settings and provide…
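To make the mean shift idea in the abstract concrete, here is a minimal sketch, not the authors' implementation. It assumes the log-contrast model y = b0 + log(X) beta + gamma + noise with the zero-sum constraint sum(beta) = 0, where gamma holds one shift per observation and a nonzero entry marks that observation as an outlier. As simplifying assumptions, the sketch puts only a convex L1 penalty on gamma, applies no sparsity penalty to beta, and omits the robust initialization and robust cross-validation described above. The function names and the parameter `lam` are hypothetical.

```python
import numpy as np


def soft_threshold(r, lam):
    """Elementwise soft-thresholding: shrink r toward zero by lam."""
    return np.sign(r) * np.maximum(np.abs(r) - lam, 0.0)


def fit_mean_shift_log_contrast(X, y, lam, n_iter=200):
    """Alternating minimization of

        0.5 * ||y - b0 - log(X) @ beta - gamma||^2 + lam * ||gamma||_1
        subject to sum(beta) == 0.

    X is an (n, p) matrix of strictly positive compositions and y has
    length n. Returns (b0, beta, gamma).
    """
    Z = np.log(X)
    # Row-centering log(X) gives the centered log-ratio transform. For any
    # beta with sum(beta) == 0, Z @ beta equals C @ beta.
    C = Z - Z.mean(axis=1, keepdims=True)
    A = np.column_stack([np.ones(len(y)), C])
    gamma = np.zeros(len(y))
    for _ in range(n_iter):
        # lstsq returns the minimum-norm least-squares solution, whose beta
        # part lies in the row space of C and therefore sums to zero.
        coef, *_ = np.linalg.lstsq(A, y - gamma, rcond=None)
        # With beta fixed, the optimal shifts are the soft-thresholded
        # residuals; observations with small residuals get gamma == 0.
        gamma = soft_threshold(y - A @ coef, lam)
    return coef[0], coef[1:], gamma
```

Calling `fit_mean_shift_log_contrast(X, y, lam=1.0)` on data with a few grossly shifted responses should return a `gamma` whose largest entries sit at those observations. With the L1 penalty on gamma this procedure is equivalent to Huber regression, which does not protect against outliers at high-leverage points; that limitation is one reason to consider non-convex penalties and a robust initialization.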
