Testing for a common subspace in compositional datasets with structural zeros
Francesco Porro, Fabio Rapallo, Sara Sommariva

TL;DR
This paper introduces a statistical test of whether two compositional datasets, one of which contains structural zeros, share a common subspace. It addresses a limitation of traditional methods, which require all components to be strictly positive.
Contribution
The authors develop a test for a common subspace in compositional data with structural zeros. It applies both under a normality assumption and in a nonparametric setting, improving robustness over existing approaches.
Findings
The test effectively distinguishes shared subspaces from distinct ones in simulated data.
It performs well on microbiome datasets with structural zeros.
The method provides an analytical null distribution approximation for normally distributed data.
Abstract
In real-world applications involving compositional datasets, structural zeros are common. They arise when, due to physical limitations, one or more variables are intrinsically zero for a subset of the population under study. The classical Aitchison approach requires all the components of a composition to be strictly positive, since adapting the most widely used statistical techniques to the compositional framework relies on computing the logratios of these components. Datasets containing structural zeros are therefore usually split into two subsets: one containing the observations with structural zeros and one containing all the other data. Statistical analysis is then performed on the two subsets separately, under the assumption that they are drawn from two different subpopulations. However, this approach may lead to incomplete results when…
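To illustrate why logratios break down in the presence of zeros, and what the usual split looks like, here is a minimal Python sketch. It is not the authors' test. The function names `clr` and `split_by_zeros`, the toy data, and the choice of the centered logratio transform are assumptions made for illustration. So is the handling of the zero subset (dropping the zero part and re-closing the remaining parts), which is one common convention rather than something stated in the abstract.

```python
import numpy as np

def clr(x):
    """Centered logratio transform of compositions stored as rows.

    Defined only when every component is strictly positive.
    """
    x = np.asarray(x, dtype=float)
    if np.any(x <= 0):
        raise ValueError("clr is undefined when a component is zero or negative")
    logx = np.log(x)
    return logx - logx.mean(axis=-1, keepdims=True)

def split_by_zeros(data):
    """Split rows into (rows without zeros, rows with at least one zero)."""
    data = np.asarray(data, dtype=float)
    has_zero = np.any(data == 0, axis=1)
    return data[~has_zero], data[has_zero]

# Hypothetical toy data: three-part compositions; the second part is a
# structural zero for the last two observations.
data = np.array([
    [0.2, 0.3, 0.5],
    [0.1, 0.6, 0.3],
    [0.4, 0.0, 0.6],
    [0.7, 0.0, 0.3],
])

full, with_zeros = split_by_zeros(data)

# The positive subset can be transformed directly.
clr_full = clr(full)

# Assumption: for the zero subset, drop the zero part and re-close the
# remaining parts so that each row sums to one again.
sub = with_zeros[:, [0, 2]]
sub = sub / sub.sum(axis=1, keepdims=True)
clr_sub = clr(sub)
```

Calling `clr(data)` on the unsplit array raises a `ValueError`, which is the practical reason for the split. After the split, `clr_full` and `clr_sub` have different numbers of columns, so the two subsets are analyzed in spaces of different dimension.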
