Principal Amalgamation Analysis for Microbiome Data
Yan Li, Gen Li, Kun Chen

TL;DR
This paper introduces Principal Amalgamation Analysis (PAA), a new dimension reduction method for microbiome data that leverages taxonomic structure to efficiently summarize high-dimensional, sparse datasets while preserving diversity information.
Contribution
The paper presents PAA, a novel amalgamation-based, taxonomy-guided dimension reduction technique with scalable hierarchical algorithms and visualization tools for microbiome data analysis.
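To make the idea of amalgamation concrete, the sketch below sums relative abundances within groups of taxa and measures how much diversity the reduced composition loses. The Shannon index is used only as one illustrative diversity measure; the sample values, the grouping, and the function names `amalgamate` and `shannon` are assumptions for illustration and are not taken from the paper.

```python
import math

def amalgamate(composition, groups):
    """Sum the relative abundances within each group of component indices."""
    return [sum(composition[i] for i in group) for group in groups]

def shannon(composition):
    """Shannon diversity index, treating 0 * log(0) as 0."""
    return -sum(p * math.log(p) for p in composition if p > 0)

# Hypothetical sample: relative abundances of five taxa (sums to 1).
sample = [0.5, 0.2, 0.2, 0.1, 0.0]

# Hypothetical grouping: taxon 0 stays alone, taxa 1-2 merge, taxa 3-4 merge.
groups = [[0], [1, 2], [3, 4]]

reduced = amalgamate(sample, groups)

# Amalgamation never increases Shannon diversity, so this loss is non-negative.
loss = shannon(sample) - shannon(reduced)
print(reduced, round(loss, 4))
```

Because amalgamation only adds components together, the reduced vector is still a composition (it sums to 1), which is what distinguishes this kind of reduction from projections that produce arbitrary linear combinations.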
Findings
PAA reduces dimensionality while preserving diversity information.
Hierarchical PAA algorithms make the computation scalable to large datasets.
The method is demonstrated on microbiome datasets from infant and HIV studies.
Abstract
In recent years, microbiome studies have become increasingly prevalent and large-scale. Through high-throughput sequencing technologies and well-established analytical pipelines, relative abundance data of operational taxonomic units and their associated taxonomic structures are routinely produced. Since such data can be extremely sparse and high dimensional, there is often a genuine need for dimension reduction to facilitate data visualization and downstream statistical analysis. We propose Principal Amalgamation Analysis (PAA), a novel amalgamation-based and taxonomy-guided dimension reduction paradigm for microbiome data. Our approach aims to aggregate the compositions into a smaller number of principal compositions, guided by the available taxonomic structure, by minimizing a properly measured loss of information. The choice of the loss function is flexible and can be based on…
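One way to picture taxonomy-guided aggregation that minimizes a loss of information is a greedy procedure that repeatedly merges the pair of components whose amalgamation costs the least, allowing only merges within the same parent taxon. The sketch below is an illustration under stated assumptions, not the authors' algorithm: the loss is the total drop in Shannon diversity across samples, the taxonomy is reduced to one hypothetical parent label per component, and all names and data are made up.

```python
import math
from itertools import combinations

def shannon(p):
    """Shannon diversity index, treating 0 * log(0) as 0."""
    return -sum(x * math.log(x) for x in p if x > 0)

def total_diversity(samples):
    """Sum of Shannon diversity over all samples."""
    return sum(shannon(s) for s in samples)

def merge_columns(samples, i, j):
    """Return new samples with component j added into component i and removed."""
    merged = []
    for s in samples:
        row = list(s)
        row[i] += row[j]
        del row[j]
        merged.append(row)
    return merged

def greedy_amalgamate(samples, parents, k):
    """Greedily amalgamate components until k remain or no merge is allowed.

    parents[c] is a hypothetical parent-taxon label for component c;
    only components sharing a parent may be merged (an assumed constraint).
    """
    samples = [list(s) for s in samples]
    parents = list(parents)
    members = [[c] for c in range(len(parents))]
    while len(parents) > k:
        base = total_diversity(samples)
        best = None
        for i, j in combinations(range(len(parents)), 2):
            if parents[i] != parents[j]:
                continue
            loss = base - total_diversity(merge_columns(samples, i, j))
            if best is None or loss < best[0]:
                best = (loss, i, j)
        if best is None:
            break  # no admissible merge left under the taxonomy constraint
        _, i, j = best  # i < j, so deleting j leaves index i valid
        samples = merge_columns(samples, i, j)
        members[i] += members[j]
        del members[j], parents[j]
    return samples, members

# Hypothetical data: two samples, four taxa, two parent taxa "A" and "B".
samples = [[0.45, 0.05, 0.30, 0.20],
           [0.40, 0.10, 0.25, 0.25]]
parents = ["A", "A", "B", "B"]

reduced, members = greedy_amalgamate(samples, parents, k=3)
print(members)
```

In this toy input the two taxa under parent "A" are merged first, because one of them is rare in both samples and absorbing it changes the diversity least. An exhaustive pairwise search like this scales poorly with the number of taxa, so it should be read as a conceptual illustration only.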
Taxonomy
Topics: Gut microbiota and health · Genomics and Phylogenetic Studies · Metabolomics and Mass Spectrometry Studies
