# TARO: tree-aggregated factor regression for microbiome data integration

**Authors:** Aditya K Mishra, Iqbal Mahmud, Philip L Lorenzi, Robert R Jenq, Jennifer A Wargo, Nadim J Ajami, Christine B Peterson

PMC · DOI: 10.1093/bioinformatics/btae321 · 2024-05-24

## TL;DR

TARO is a method for integrating microbiome and metabolomic data to characterize how gut microorganisms shape metabolite abundances, demonstrated on profiles from subjects being screened for colorectal cancer.

## Contribution

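TARO introduces a tree-aggregated factor regression approach that uses the taxonomic tree structure to flexibly aggregate rare features, addressing the high dimensionality, compositionality, and rare features of microbiome profiling data.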

## Key findings

- TARO accurately recovers low-rank coefficient matrices in simulations.
- TARO identifies relevant features in microbiome-metabolomic associations.
- TARO was applied to colorectal cancer screening data to explore gut microbe-metabolite relationships.
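
To make the first finding concrete, the following is a minimal sketch of what recovering a low-rank coefficient matrix in a simulation can look like. It uses classical reduced-rank regression (an OLS fit projected onto its leading singular directions), which is a stand-in and not TARO's estimator. The dimensions, noise level, and the function name `reduced_rank_regression` are assumptions chosen for illustration.

```python
import numpy as np

def reduced_rank_regression(X, Y, rank):
    """Classical reduced-rank regression (illustrative stand-in, not TARO).

    Fits ordinary least squares, then projects the coefficients onto the
    top `rank` right singular directions of the fitted values.
    """
    B_ols, *_ = np.linalg.lstsq(X, Y, rcond=None)
    _, _, Vt = np.linalg.svd(X @ B_ols, full_matrices=False)
    V_r = Vt[:rank].T                      # q x rank
    return B_ols @ V_r @ V_r.T             # p x q, rank <= `rank`

# Hypothetical simulation settings (assumed, not taken from the paper).
rng = np.random.default_rng(0)
n, p, q, true_rank = 200, 10, 8, 2

# Build a rank-2 coefficient matrix C = U V^T.
U = rng.normal(size=(p, true_rank))
V = rng.normal(size=(q, true_rank))
C_true = U @ V.T

# Simulate predictors and multivariate responses with small Gaussian noise.
X = rng.normal(size=(n, p))
Y = X @ C_true + 0.1 * rng.normal(size=(n, q))

C_hat = reduced_rank_regression(X, Y, true_rank)
rel_error = np.linalg.norm(C_hat - C_true) / np.linalg.norm(C_true)
est_rank = np.linalg.matrix_rank(C_hat)
print(f"estimated rank: {est_rank}, relative error: {rel_error:.4f}")
```

The relative Frobenius error between the estimate and the true matrix is one common way to summarize recovery accuracy; the paper may report different metrics.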

## Abstract

Although the human microbiome plays a key role in health and disease, the biological mechanisms underlying the interaction between the microbiome and its host are incompletely understood. Integration with other molecular profiling data offers an opportunity to characterize the role of the microbiome and elucidate therapeutic targets. However, this remains challenging due to the high dimensionality, compositionality, and rare features found in microbiome profiling data. These challenges necessitate the use of methods that can achieve structured sparsity in learning cross-platform association patterns.

We propose Tree-Aggregated factor RegressiOn (TARO) for the integration of microbiome and metabolomic data. We leverage information on the taxonomic tree structure to flexibly aggregate rare features. We demonstrate through simulation studies that TARO accurately recovers a low-rank coefficient matrix and identifies relevant features. We applied TARO to microbiome and metabolomic profiles gathered from subjects being screened for colorectal cancer to understand how gut microorganisms shape intestinal metabolite abundances.
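
The idea of aggregating rare features along a taxonomic tree can be illustrated with a small sketch. The abstract does not spell out how TARO decides which features to aggregate, so the fixed prevalence threshold used here is an assumption, as are the toy taxa names, the `parent` mapping, and the helper `aggregate_rare`. Rare genera are summed into a node labeled by their family, while common genera are kept as they are.

```python
import numpy as np

# Hypothetical toy data: rows are samples, columns are genus-level taxa.
taxa = ["g_A", "g_B", "g_C", "g_D"]
parent = {"g_A": "f_X", "g_B": "f_Y", "g_C": "f_Y", "g_D": "f_Y"}
counts = np.array([
    [10, 0, 1, 5],
    [12, 1, 0, 7],
    [ 9, 0, 0, 6],
    [11, 0, 0, 4],
])

def aggregate_rare(counts, taxa, parent, min_prevalence=0.5):
    """Sum rare taxa into their parent node (assumed rule, not TARO's).

    A taxon is 'rare' if it is nonzero in fewer than `min_prevalence`
    of the samples. Rare siblings sharing a parent are merged into one
    column labeled by the parent; other taxa keep their own column.
    """
    prevalence = (counts > 0).mean(axis=0)
    columns = {}
    for j, name in enumerate(taxa):
        label = name if prevalence[j] >= min_prevalence else parent[name]
        columns[label] = columns.get(label, 0) + counts[:, j]
    names = list(columns)
    return names, np.column_stack([columns[k] for k in names])

names, agg = aggregate_rare(counts, taxa, parent)
print(names)
print(agg)
```

In this toy table the two rare genera `g_B` and `g_C` each appear in one of four samples; after merging, the combined `f_Y` column is nonzero in two samples, and per-sample totals are unchanged.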

The R package TARO implementing the proposed methods is available online at https://github.com/amishra-stats/taro-package.

## Linked entities

- **Diseases:** colorectal cancer (MONDO:0005575)

## Full-text entities

- **Diseases:** colorectal cancer (MESH:D015179)
- **Species:** Homo sapiens (human, species) [taxon 9606]

## Figures

4 figures with captions in the complete paper: https://tomesphere.com/paper/PMC11193058/full.md

---
Source: https://tomesphere.com/paper/PMC11193058