Perturbation-based Effect Measures for Compositional Data
Anton Rask Lundborg, Niklas Pfister

TL;DR
This paper introduces a perturbation-based framework for effect measures on compositional data. It targets high-dimensional, sparse settings such as microbiome data, and defines interpretable effects of a composition (or of its summary statistics) on a response that account for confounding.
Contribution
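It defines average perturbation effects, interpretable statistical functionals based on hypothetical perturbations of the compositions, and shows how to estimate them efficiently through a perturbation-dependent reparametrization combined with semiparametric estimation.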
Findings
Effective in high-dimensional, sparse microbiome data
Demonstrates bias reduction compared to traditional methods
Outperforms existing techniques on real datasets
Abstract
Existing effect measures for compositional features are inadequate for many modern applications, for example in microbiome research, where the data display traits such as high dimensionality and sparsity that can be poorly modelled with traditional parametric approaches. Further, assessing, in an unbiased way, how summary statistics of a composition (e.g., racial diversity) affect a response variable is not straightforward. We propose a framework based on hypothetical data perturbations that defines interpretable statistical functionals on the compositions themselves, which we call average perturbation effects. These effects naturally account for confounding that biases frequently used marginal dependence analyses. We show how average perturbation effects can be estimated efficiently by deriving a perturbation-dependent reparametrization and applying semiparametric estimation…
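As a rough illustration of what a "hypothetical data perturbation" of a composition could look like, the sketch below moves a small fraction of each composition's mass onto one component and measures the average change in a response function. Everything here is an assumption made for illustration: the particular perturbation, the names `perturb_toward` and `naive_perturbation_effect`, and the known linear response `f`. It is a naive plug-in calculation on simulated data, not the semiparametric estimator described in the abstract.

```python
import numpy as np

def perturb_toward(z, j, eps):
    """Move a fraction eps of each row's mass onto component j.

    Hypothetical perturbation: (1 - eps) * z + eps * e_j, which keeps
    every row non-negative and summing to one (i.e. on the simplex).
    """
    z = np.asarray(z, dtype=float)
    e_j = np.zeros(z.shape[1])
    e_j[j] = 1.0
    return (1.0 - eps) * z + eps * e_j

def naive_perturbation_effect(f, z, j, eps=1e-3):
    """Average finite-difference change in f under the perturbation.

    A plug-in illustration only; it assumes f is a known (or already
    fitted) regression function and ignores confounding adjustment.
    """
    return np.mean(f(perturb_toward(z, j, eps)) - f(z)) / eps

# Simulated compositions with three components (rows sum to one).
rng = np.random.default_rng(0)
z = rng.dirichlet(alpha=[1.0, 1.0, 1.0], size=500)

# Hypothetical linear response function, chosen so the answer is checkable.
beta = np.array([2.0, 0.0, -1.0])
f = lambda comp: comp @ beta

effect = naive_perturbation_effect(f, z, j=0)
print(effect)
```

For a linear `f`, the perturbation changes each prediction by `eps * (beta[j] - z @ beta)`, so the printed value should agree with `beta[0] - np.mean(z @ beta)`. This shows that the effect depends on where the mass is taken from as well as where it is placed.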
Taxonomy
Topics: Geochemistry and Geologic Mapping · Oral microbiology and periodontitis research · Hydrocarbon exploration and reservoir analysis
