# TCVS: tree-guided compositional variable selection analysis of microbiome data

**Authors:** Yicong Mao, Zhiwen Jiang, Tianying Wang, Yijuan Hu, Xiang Zhan

PMC · DOI: 10.1093/bioinformatics/btaf617 · 2025-11-09

## TL;DR

This paper introduces TCVS, a tree-guided compositional variable selection method that uses the taxonomic tree and knockoff copies of microbiome features to identify taxa associated with health outcomes more accurately.

## Contribution

The contribution is a tree-guided variable selection method that adds knockoff copies of microbiome features, designated as noise, to assess false positives and refine the selection of taxa in microbiome data analysis.

## Key findings

- In extensive numerical simulations, TCVS outperforms several existing methods in selection accuracy for outcome-associated microbial taxa.
- Applied to a widely used gut microbiome dataset, the method identifies microbial taxa linked to body mass index.
- Using the taxonomic tree structure allows taxa associated with disease outcomes to be identified more flexibly.

## Abstract

Studies of microbial communities, represented by the relative abundances of taxa at various taxonomic levels, have underscored the significance of microbiota in numerous aspects of human health and disease. A pivotal challenge in microbiome research lies in pinpointing microbial taxa associated with disease outcomes, which could play crucial roles in prevention, detection, and treatment of various health conditions. Alongside these relative abundance data, taxonomic information sometimes offers a unique lens to explore the impact of shared evolutionary histories on patterns of microbial abundance.

In pursuit of this goal, we utilize the tree structure to more flexibly identify taxa associated with disease outcomes. To enhance the accuracy of our selection process, we introduce auxiliary knockoff copies of microbiome features designated as noise. This approach allows for the assessment of false positives in the selection process and aids in refining it towards more precise outcomes. Extensive numerical simulations demonstrate that our methodology outperforms several existing methods in terms of selection accuracy. Furthermore, we demonstrate the practicality of our approach by applying it to a widely used gut microbiome dataset, identifying microbial taxa linked to body mass index.
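The role of knockoff copies described above can be illustrated with the generic knockoff+ selection rule from the knockoff filter literature. This is a minimal sketch and not the TCVS implementation: the function names, the coefficient-difference importance statistic, and the toy numbers are assumptions made for illustration, and the tree-guided and compositional steps of the paper are not reproduced here. The idea is that each feature receives a statistic `W_j` that is large and positive when the original feature looks more important than its knockoff copy, so negative values of `W_j` give an estimate of how many selections are false positives.

```python
import numpy as np

def importance_stats(beta_orig, beta_knock):
    """Coefficient-difference statistic W_j = |b_j| - |b_j knockoff| (assumed choice)."""
    return np.abs(np.asarray(beta_orig, float)) - np.abs(np.asarray(beta_knock, float))

def knockoff_plus_threshold(W, q=0.1):
    """Smallest t with (1 + #{W_j <= -t}) / max(1, #{W_j >= t}) <= q, else infinity."""
    W = np.asarray(W, dtype=float)
    # Candidate thresholds are the distinct nonzero magnitudes of W.
    candidates = np.sort(np.unique(np.abs(W[W != 0])))
    for t in candidates:
        # Features whose knockoff wins (W_j <= -t) estimate the false positives.
        fdp_hat = (1 + np.sum(W <= -t)) / max(1, np.sum(W >= t))
        if fdp_hat <= q:
            return t
    return np.inf

def knockoff_select(W, q=0.1):
    """Indices of features whose statistic reaches the knockoff+ threshold."""
    W = np.asarray(W, dtype=float)
    return np.flatnonzero(W >= knockoff_plus_threshold(W, q))

# Toy statistics for 10 hypothetical taxa (made-up numbers, not from the paper).
W = np.array([5, 4, 3.5, 3, 2.5, 2, -1, 0.5, -0.2, 1.5])
selected = knockoff_select(W, q=0.2)
print(selected)
```

With a stricter target such as `q=0.1`, the same toy statistics yield no selections, because with so few features the `1 +` term in the numerator keeps the estimated false positive proportion above the target.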

TCVS R code is available at https://github.com/Yicong1225/TCVS.

## Full-text entities

- **Species:** Homo sapiens (human, species) [taxon 9606]

## Figures

2 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12629236/full.md

---
Source: https://tomesphere.com/paper/PMC12629236