Causal Clustering for 1-Factor Measurement Models on Data with Various Types
Shuyan Wang

TL;DR
This paper extends the applicability of tetrad-based causal discovery algorithms to mixed data types, including discrete and continuous variables, by proving the validity of tetrad constraints in these cases and demonstrating their effectiveness through simulations.
Contribution
It proves that tetrad constraints can be used for causal clustering in mixed data types, broadening the scope of existing algorithms beyond Gaussian and binary cases.
Findings
FOFC performs well on mixed data in simulations
Tetrad constraints are valid for mixed data types
Algorithms can detect latent variables in diverse data types
Abstract
The tetrad constraint is a condition whose satisfaction signals a rank reduction of a covariance submatrix. It is used to design causal discovery algorithms, such as FOFC, that detect the existence of latent (unmeasured) variables. Initially, such algorithms worked only for cases where the measured and latent variables are all Gaussian and linearly related (Gaussian-Gaussian case). It has been shown that a unidimensional latent variable model implies tetrad constraints when the measured and latent variables are all binary (Binary-Binary case). This paper proves that the tetrad constraint is also entailed when the measured variables are of mixed data types and when the measured variables are discrete and the latent common causes are continuous, which implies that any clustering algorithm relying on this constraint can work in those cases. Each case is shown with an example…
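To make the rank-reduction idea concrete, here is a minimal sketch of the linear Gaussian-Gaussian case only, not the paper's mixed-data proofs or the FOFC algorithm. It builds the population covariance of four indicators of a single latent variable and checks that all three tetrad differences vanish. The loadings, noise variances, and the function names `one_factor_covariance` and `tetrad_differences` are illustrative assumptions and do not come from the paper.

```python
import numpy as np


def one_factor_covariance(loadings, latent_var, noise_vars):
    """Population covariance of X_i = loading_i * L + e_i (hypothetical helper).

    Assumes one latent L with variance `latent_var` and mutually
    independent noise terms e_i with variances `noise_vars`.
    """
    loadings = np.asarray(loadings, dtype=float)
    return latent_var * np.outer(loadings, loadings) + np.diag(noise_vars)


def tetrad_differences(cov, i, j, k, l):
    """Return the three tetrad differences among variables i, j, k, l."""
    return (
        cov[i, j] * cov[k, l] - cov[i, k] * cov[j, l],
        cov[i, j] * cov[k, l] - cov[i, l] * cov[j, k],
        cov[i, k] * cov[j, l] - cov[i, l] * cov[j, k],
    )


# Illustrative parameter values (assumptions, not taken from the paper).
sigma = one_factor_covariance(
    loadings=[0.9, 0.7, 1.2, 0.5],
    latent_var=1.0,
    noise_vars=[0.3, 0.5, 0.4, 0.6],
)

diffs = tetrad_differences(sigma, 0, 1, 2, 3)

# The cross-covariance block between {X0, X1} and {X2, X3} has rank 1,
# which is the rank reduction that the tetrad constraint signals.
sub = sigma[np.ix_([0, 1], [2, 3])]
rank = np.linalg.matrix_rank(sub)

print("tetrad differences:", diffs)
print("rank of cross-covariance block:", rank)
```

Because every off-diagonal covariance factors as the product of two loadings times the latent variance, each tetrad difference cancels up to floating-point error. With sample data the differences would only be approximately zero, so a statistical test would be needed in place of an exact comparison.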
Taxonomy
Topics: Bayesian Modeling and Causal Inference · Rough Sets and Fuzzy Logic · Multi-Criteria Decision Making
