# Causal Inference on Multivariate and Mixed-Type Data

**Authors:** Alexander Marx, Jilles Vreeken

arXiv: 1702.06385 · 2017-10-17

## TL;DR

This paper introduces Crack, a fast greedy algorithm that uses MDL-based scores and classification and regression trees to infer the most likely causal direction between two random variables $X$ and $Y$, each of which may be univariate or multivariate and of the same or mixed data types.

## Contribution

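The paper proposes two MDL-based scores, one for when $X$ and $Y$ are of the same single data type and one for when they are mixed-type, together with Crack, a fast greedy algorithm that infers the causal direction directly from the data. This extends previous approaches to multivariate and mixed-type data.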

## Key findings

- In an empirical evaluation on a wide range of data, Crack reliably infers the correct causal direction with high accuracy.
- The method works on both univariate and multivariate cause-effect pairs.
- It is effective on both single-type and mixed-type data.

## Abstract

Given data over the joint distribution of two random variables $X$ and $Y$, we consider the problem of inferring the most likely causal direction between $X$ and $Y$. In particular, we consider the general case where both $X$ and $Y$ may be univariate or multivariate, and of the same or mixed data types. We take an information theoretic approach, based on Kolmogorov complexity, from which it follows that first describing the data over cause and then that of effect given cause is shorter than the reverse direction. The ideal score is not computable, but can be approximated through the Minimum Description Length (MDL) principle. Based on MDL, we propose two scores, one for when both $X$ and $Y$ are of the same single data type, and one for when they are mixed-type. We model dependencies between $X$ and $Y$ using classification and regression trees. As inferring the optimal model is NP-hard, we propose Crack, a fast greedy algorithm to determine the most likely causal direction directly from the data. Empirical evaluation on a wide range of data shows that Crack reliably, and with high accuracy, infers the correct causal direction on both univariate and multivariate cause-effect pairs over both single and mixed-type data.
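
The idea in the abstract, that encoding the cause first and then the effect given the cause should cost fewer bits than the reverse order, can be illustrated with a toy two-part code. The sketch below is not Crack and does not reproduce the paper's scores. It assumes univariate numeric data, a Gaussian code at a fixed resolution, and equal-frequency bins as a crude stand-in for regression trees. All names (`RESOLUTION`, `MAX_BINS`, `rescale`, `gaussian_bits`, `conditional_bits`, `infer_direction`) are hypothetical and chosen only for illustration.

```python
import math
import random

RESOLUTION = 0.01  # assumed recording precision of the rescaled data (hypothetical)
MAX_BINS = 8       # assumed cap on the number of leaves (hypothetical)


def rescale(values):
    """Min-max rescale to [0, 1] so both variables are encoded on one scale."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def gaussian_bits(values):
    """Approximate bits to encode values with a fitted Gaussian at RESOLUTION."""
    n = len(values)
    if n == 0:
        return 0.0
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / n
    var = max(var, RESOLUTION ** 2)  # floor keeps the cost finite and positive
    per_value = 0.5 * math.log2(2 * math.pi * math.e * var) - math.log2(RESOLUTION)
    param_bits = math.log2(n)  # crude cost for the two fitted parameters
    return n * per_value + param_bits


def conditional_bits(target, given):
    """Cheapest encoding of target over 1..MAX_BINS equal-frequency bins of given."""
    n = len(target)
    order = sorted(range(n), key=lambda i: given[i])
    best = math.inf
    for k in range(1, MAX_BINS + 1):
        split_bits = (k - 1) * math.log2(n)  # cost of naming k - 1 split points
        leaf_bits = sum(
            gaussian_bits([target[i] for i in order[j * n // k:(j + 1) * n // k]])
            for j in range(k)
        )
        best = min(best, split_bits + leaf_bits)
    return best


def infer_direction(x, y):
    """Compare L(X) + L(Y|X) against L(Y) + L(X|Y); the shorter total wins."""
    x, y = rescale(x), rescale(y)
    x_to_y = gaussian_bits(x) + conditional_bits(y, x)
    y_to_x = gaussian_bits(y) + conditional_bits(x, y)
    if math.isclose(x_to_y, y_to_x):
        return "undecided", x_to_y, y_to_x
    return ("X -> Y" if x_to_y < y_to_x else "Y -> X"), x_to_y, y_to_x


# Synthetic example: the effect is a noisy cubic function of the cause.
rng = random.Random(0)
cause = [rng.uniform(-1, 1) for _ in range(500)]
effect = [c ** 3 + rng.gauss(0, 0.05) for c in cause]
print(infer_direction(cause, effect))
```

A score this crude can easily pick the wrong direction. The sketch only shows the shape of the comparison: two totals in bits, each the sum of a marginal cost and a conditional cost, with the conditional model paying for every split it uses.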

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1702.06385/full.md

## Figures

11 figures with captions in the complete paper: https://tomesphere.com/paper/1702.06385/full.md

## References

34 references — full list in the complete paper: https://tomesphere.com/paper/1702.06385/full.md

---
Source: https://tomesphere.com/paper/1702.06385