Compositional Covariance Shrinkage and Regularised Partial Correlations
Suzanne Jin, Cedric Notredame, Ionas Erb

TL;DR
This paper introduces a shrinkage-based estimation procedure for covariance and partial correlations in wide compositional data sets. It accounts for the interdependence of logratio variables, evaluates different zero imputations, and is demonstrated on simulated and single-cell gene expression data.
Contribution
The paper constructs bespoke shrinkage targets for logratio covariance matrices from logratio uncorrelated compositions and uses the resulting covariance estimates to obtain regularised partial correlations.
Findings
Shrinkage-based covariance estimation procedure for wide compositional data sets.
Analytical derivation of the partial correlation induced by data closure.
Evaluation of different zero imputations for the underlying counts, tested on a single-cell gene expression data set.
Abstract
We propose an estimation procedure for covariation in wide compositional data sets. For compositions, widely used logratio variables are interdependent due to a common reference. Logratio uncorrelated compositions are linearly independent before the unit-sum constraint is imposed. We show how they are used to construct bespoke shrinkage targets for logratio covariance matrices and test a simple procedure for partial correlation estimates on both a simulated and a single-cell gene expression data set. For the underlying counts, different zero imputations are evaluated. The partial correlation induced by the closure is derived analytically. Data and code are available from GitHub.
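The general pipeline the abstract describes (logratio transform, covariance shrinkage, partial correlations) can be illustrated with a minimal sketch. This is not the authors' implementation: the diagonal shrinkage target, the fixed intensity `lam`, and the pseudocount used for zero imputation are all assumptions chosen for simplicity, whereas the paper builds bespoke targets from logratio uncorrelated compositions. The function names are hypothetical.

```python
import numpy as np

def clr(counts, pseudocount=1.0):
    """Centred logratio transform; the pseudocount is one simple zero imputation (assumption)."""
    x = np.asarray(counts, dtype=float) + pseudocount
    logx = np.log(x)
    # Subtract each sample's mean log value (the log of its geometric mean).
    return logx - logx.mean(axis=1, keepdims=True)

def shrink_covariance(z, lam=0.2):
    """Convex combination of the sample covariance and a diagonal target.

    The diagonal target and the fixed lam are illustrative assumptions,
    not the bespoke targets proposed in the paper.
    """
    s = np.cov(z, rowvar=False)
    target = np.diag(np.diag(s))
    return (1.0 - lam) * s + lam * target

def partial_correlations(cov):
    """Partial correlations from the inverse covariance (precision) matrix."""
    prec = np.linalg.inv(cov)
    d = np.sqrt(np.diag(prec))
    pcor = -prec / np.outer(d, d)
    np.fill_diagonal(pcor, 1.0)
    return pcor

# Synthetic "wide" count data: 20 samples, 50 parts (more parts than samples).
rng = np.random.default_rng(0)
counts = rng.poisson(lam=5.0, size=(20, 50))

z = clr(counts)
sigma = shrink_covariance(z, lam=0.2)
pcor = partial_correlations(sigma)
```

Shrinkage matters here because the sample covariance of centred logratios is singular: the rows of `z` sum to zero, and there are fewer samples than parts. Mixing in a positive diagonal target makes the estimate invertible, so the precision matrix and the partial correlations are defined.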
Taxonomy
Topics: Geochemistry and Geologic Mapping · Mineral Processing and Grinding · Hydrocarbon exploration and reservoir analysis
