# Regression models for compositional data: General log-contrast formulations, proximal optimization, and microbiome data applications

**Authors:** Patrick L. Combettes, Christian L. Müller

arXiv: 1903.01050 · 2019-03-05

## TL;DR

This paper proposes a general convex optimization framework for linear log-contrast regression on compositional data, together with a proximal algorithm that solves the resulting constrained problem exactly with convergence guarantees. The framework is illustrated on soil and gut microbiome data.

## Contribution

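The paper presents a general convex optimization model for linear log-contrast regression that includes many previous proposals as special cases, along with a proximal algorithm that solves the resulting constrained optimization problem exactly with rigorous convergence guarantees.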
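
For orientation, the classical linear log-contrast model and one regularized special case can be written as below. This is a sketch in assumed notation (response $y$, compositions $x_i$ with $p$ strictly positive parts, coefficients $\beta$, penalty weight $\lambda$); none of these symbols come from this summary, and the paper's general model may use other losses, penalties, and constraints.

```latex
% Linear log-contrast model for a composition x_i with p positive parts
y_i = \sum_{j=1}^{p} \beta_j \log x_{ij} + \varepsilon_i,
\qquad \sum_{j=1}^{p} \beta_j = 0 .

% The zero-sum constraint turns the predictor into a combination of log-ratios:
\sum_{j=1}^{p} \beta_j \log x_{ij}
  = \sum_{j=1}^{p-1} \beta_j \log\frac{x_{ij}}{x_{ip}}
  \quad \text{since } \beta_p = -\sum_{j=1}^{p-1} \beta_j .

% One regularized special case (sparse log-contrast regression)
\min_{\beta \in \mathbb{R}^p}
  \tfrac{1}{2} \lVert y - Z\beta \rVert_2^2 + \lambda \lVert \beta \rVert_1
  \quad \text{subject to } \mathbf{1}^\top \beta = 0,
  \qquad Z_{ij} = \log x_{ij} .
```

Because the coefficients sum to zero, rescaling a composition by any positive constant leaves the prediction unchanged, which is why the model suits relative-abundance data.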

## Key findings

- Several instances of the model are evaluated on soil microbiome data analysis tasks.
- The same framework is applied to gut microbiome data analysis tasks.
- The proximal algorithm solves the constrained optimization problem exactly, with rigorous convergence guarantees.

## Abstract

Compositional data sets are ubiquitous in science, including geology, ecology, and microbiology. In microbiome research, compositional data primarily arise from high-throughput sequence-based profiling experiments. These data comprise microbial compositions in their natural habitat and are often paired with covariate measurements that characterize physicochemical habitat properties or the physiology of the host. Inferring parsimonious statistical associations between microbial compositions and habitat- or host-specific covariate data is an important step in exploratory data analysis. A standard statistical model linking compositional covariates to continuous outcomes is the linear log-contrast model. This model describes the response as a linear combination of log-ratios of the original compositions and has been extended to the high-dimensional setting via regularization. In this contribution, we propose a general convex optimization model for linear log-contrast regression which includes many previous proposals as special cases. We introduce a proximal algorithm that solves the resulting constrained optimization problem exactly with rigorous convergence guarantees. We illustrate the versatility of our approach by investigating the performance of several model instances on soil and gut microbiome data analysis tasks.
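
To make the log-contrast idea concrete, here is a minimal Python sketch of one special case: an ℓ1-penalized least-squares fit with a zero-sum constraint on the coefficients. It is not the paper's algorithm. It uses plain Douglas-Rachford splitting as a standard proximal scheme, and the function names, the squared-error loss, and the step-size heuristic are all assumptions made for illustration. Compositions are assumed strictly positive; zeros would need separate handling before taking logs.

```python
import numpy as np


def soft_threshold(u, tau):
    """Proximity operator of tau * ||.||_1 (elementwise shrinkage)."""
    return np.sign(u) * np.maximum(np.abs(u) - tau, 0.0)


def fit_sparse_log_contrast(X, y, lam, gamma=None, n_iter=1000):
    """Douglas-Rachford sketch (illustrative, hypothetical helper) for

        min_beta 0.5 * ||y - log(X) beta||^2 + lam * ||beta||_1
        subject to sum(beta) = 0.

    X: (n, p) array of strictly positive compositions; y: (n,) responses.
    """
    Z = np.log(np.asarray(X, dtype=float))
    y = np.asarray(y, dtype=float)
    p = Z.shape[1]

    if gamma is None:
        # Heuristic step size (an assumption, not from the paper): inverse
        # squared spectral norm of the row-centered log data.
        Zc = Z - Z.mean(axis=1, keepdims=True)
        gamma = 1.0 / np.linalg.norm(Zc, 2) ** 2

    # Quantities reused by the prox of f = least squares + zero-sum indicator.
    A = gamma * Z.T @ Z + np.eye(p)
    A_inv_ones = np.linalg.solve(A, np.ones(p))
    Zty = gamma * Z.T @ y

    def prox_f(v):
        # Solves min_b gamma/2 * ||y - Z b||^2 + 1/2 * ||b - v||^2
        # subject to sum(b) = 0, via the KKT system with one multiplier mu.
        b = np.linalg.solve(A, Zty + v)
        mu = b.sum() / A_inv_ones.sum()
        return b - mu * A_inv_ones

    v = np.zeros(p)
    for _ in range(n_iter):
        beta = prox_f(v)                                   # zero-sum LS step
        w = soft_threshold(2.0 * beta - v, gamma * lam)    # l1 step
        v = v + w - beta                                   # DR update
    # Return the iterate that satisfies the zero-sum constraint exactly.
    return prox_f(v)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.dirichlet(np.ones(8), size=50)   # synthetic compositions
    beta_true = np.array([1.0, -1.0, 0, 0, 0, 0, 0, 0])
    y = np.log(X) @ beta_true + 0.01 * rng.standard_normal(50)
    beta_hat = fit_sparse_log_contrast(X, y, lam=0.1)
    print(np.round(beta_hat, 3), beta_hat.sum())
```

The returned vector sums to zero up to rounding, so predictions `np.log(X) @ beta_hat` do not change if the rows of `X` are rescaled, for example when switching from proportions to raw counts.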

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1903.01050/full.md

## Figures

6 figures with captions in the complete paper: https://tomesphere.com/paper/1903.01050/full.md

## References

29 references — full list in the complete paper: https://tomesphere.com/paper/1903.01050/full.md

---
Source: https://tomesphere.com/paper/1903.01050