latentcor: An R Package for estimating latent correlations from mixed data types
Mingze Huang, Christian L. Müller, Irina Gaynanova

TL;DR
The paper introduces `latentcor`, an R package that accurately estimates correlations among mixed data types using semi-parametric latent Gaussian copula models, suitable for high-throughput data analysis.
Contribution
It provides a comprehensive R package implementing latent Gaussian copula models for mixed data correlation estimation, improving accuracy and efficiency over traditional methods.
Findings
Efficient multi-linear interpolation reduces memory usage.
Accurate correlation estimation across diverse mixed data types.
Applicable to high-throughput data analysis workflows.
Abstract
We present `latentcor`, an R package for correlation estimation from data with mixed variable types. Mixed variable types, including continuous, binary, ordinal, zero-inflated, or truncated data, are routinely collected in many areas of science. Accurate estimation of correlations among such variables is often the first critical step in statistical analysis workflows. Pearson correlation as the default choice is not well suited for mixed data types as the underlying normality assumption is violated. The concept of semi-parametric latent Gaussian copula models, on the other hand, provides a unifying way to estimate correlations between mixed data types. The R package `latentcor` comprises a comprehensive list of these models, enabling the estimation of correlations between any of continuous/binary/ternary/zero-inflated (truncated) variable types. The underlying implementation takes…
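To make the contrast between Pearson correlation and latent Gaussian copula estimation concrete, here is a minimal Python sketch of the simplest case only: two continuous variables. It is not the `latentcor` API and not the package's implementation. It assumes the standard Gaussian copula identity for continuous pairs, r = sin(π/2 · τ), where τ is Kendall's tau; the function names (`kendall_tau`, `latent_cor_continuous`, `simulate`) and the simulated transformations are hypothetical choices for illustration. Binary, ternary, and truncated variables need different bridge functions, which this sketch does not cover.

```python
import math
import random


def kendall_tau(x, y):
    """Kendall's tau-a: (concordant - discordant) / number of pairs."""
    n = len(x)
    score = 0
    for i in range(n):
        for j in range(i + 1, n):
            prod = (x[i] - x[j]) * (y[i] - y[j])
            if prod > 0:
                score += 1
            elif prod < 0:
                score -= 1
    return score / (n * (n - 1) / 2)


def latent_cor_continuous(x, y):
    """Latent correlation for a continuous/continuous pair.

    Assumes a Gaussian copula, under which r = sin(pi/2 * tau).
    """
    return math.sin(math.pi / 2 * kendall_tau(x, y))


def pearson(x, y):
    """Plain sample Pearson correlation, for comparison."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def simulate(n, rho, seed=0):
    """Draw latent bivariate normals with correlation rho, then apply
    monotone transformations so the observed data are non-Gaussian."""
    rng = random.Random(seed)
    xs, ys = [], []
    for _ in range(n):
        z1 = rng.gauss(0.0, 1.0)
        z2 = rho * z1 + math.sqrt(1.0 - rho ** 2) * rng.gauss(0.0, 1.0)
        xs.append(math.exp(z1))  # log-normal margin
        ys.append(z2 ** 3)       # heavy-tailed margin
    return xs, ys


true_rho = 0.7
x, y = simulate(n=600, rho=true_rho, seed=1)
latent_est = latent_cor_continuous(x, y)
pearson_est = pearson(x, y)
print(f"true latent correlation: {true_rho}")
print(f"rank-based latent estimate: {latent_est:.3f}")
print(f"Pearson on observed data: {pearson_est:.3f}")
```

Because Kendall's tau depends only on ranks, the monotone transformations in `simulate` leave the latent estimate unchanged, whereas Pearson correlation is computed on the distorted observed scale and need not match the latent value.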
