Instrumental Variable Estimation for Compositional Treatments
Elisabeth Ailer, Christian L. Müller, Niki Kilbertus

TL;DR
This paper develops causal inference methods for compositional data, such as microbiome and ecological datasets, highlighting pitfalls and proposing multivariate transformations and regression techniques for valid cause-effect analysis.
Contribution
It introduces a causal framework for compositional data using instrumental variables and develops new multivariate methods that respect the data's structure.
Findings
The proposed methods outperform traditional approaches on synthetic data
Application to real microbiome data demonstrates practical utility
Highlights pitfalls in interpreting compositional causes
Abstract
Many scientific datasets are compositional in nature. Important biological examples include species abundances in ecology, cell-type compositions derived from single-cell sequencing data, and amplicon abundance data in microbiome research. Here, we provide a causal view on compositional data in an instrumental variable setting where the composition acts as the cause. First, we crisply articulate potential pitfalls for practitioners regarding the interpretation of compositional causes from the viewpoint of interventions and warn against attributing causal meaning to common summary statistics such as diversity indices in microbiome data analysis. We then advocate for and develop multivariate methods using statistical data transformations and regression techniques that take the special structure of the compositional sample space into account while still yielding scientifically…
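Example
To make the idea of combining a compositional data transformation with instrumental variable regression concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not necessarily the estimator used in the paper: the centered log-ratio (clr) transform, the synthetic data-generating process, and all names (`clr`, `fit_linear`, `beta_true`, and so on) are choices made for this example. It runs two-stage least squares on clr coordinates and compares the result with a naive regression that ignores the unobserved confounder.

```python
import numpy as np

def clr(x):
    """Centered log-ratio transform of compositions stored in rows."""
    log_x = np.log(x)
    return log_x - log_x.mean(axis=1, keepdims=True)

def fit_linear(design, target):
    """Minimum-norm least squares with an intercept; returns (intercept, coefs)."""
    with_const = np.column_stack([np.ones(len(design)), design])
    sol = np.linalg.lstsq(with_const, target, rcond=1e-10)[0]
    return sol[0], sol[1:]

# --- Synthetic data (all values are assumptions for illustration) ---
rng = np.random.default_rng(0)
n, p, q = 20000, 3, 2                      # samples, parts, instruments
Z = rng.normal(size=(n, q))                # instruments
U = rng.normal(size=n)                     # unobserved confounder
A = np.array([[1.0, -1.0, 0.0],
              [0.0, 1.0, -1.0]])           # instrument effect on log-composition
c = np.array([1.0, -0.5, -0.5])            # confounder effect on log-composition

logits = Z @ A + np.outer(U, c) + 0.3 * rng.normal(size=(n, p))
expo = np.exp(logits - logits.max(axis=1, keepdims=True))
X = expo / expo.sum(axis=1, keepdims=True)  # compositions: rows sum to 1

beta_true = np.array([1.0, -2.0, 1.0])     # log-contrast effect, sums to zero
Y = clr(X) @ beta_true + 2.0 * U + rng.normal(size=n)

# --- Stage 1: regress the clr coordinates on the instruments ---
W = clr(X)
b0, B = fit_linear(Z, W)
W_hat = b0 + Z @ B                         # part of the composition explained by Z

# --- Stage 2: regress the outcome on the stage-1 fitted values ---
_, beta_2sls = fit_linear(W_hat, Y)

# --- Naive baseline: regress the outcome directly on clr(X) ---
_, beta_ols = fit_linear(W, Y)

print("true :", beta_true)
print("2SLS :", np.round(beta_2sls, 3))
print("naive:", np.round(beta_ols, 3))
```

Because clr coordinates sum to zero in every row, the design matrix is rank deficient. The minimum-norm least-squares solution then has coefficients that also sum to zero, which matches the usual constraint of a log-contrast model. In this setup the naive fit is biased by the confounder, while the two-stage estimate uses only the variation in the composition that is driven by the instruments.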
Taxonomy
Topics: Geochemistry and Geologic Mapping
