Mining Combined Causes in Large Data Sets

Saisai Ma; Jiuyong Li; Lin Liu; Thuc Duy Le

arXiv:1508.07092 · cs.AI · October 16, 2015

TL;DR

This paper introduces a novel, computationally efficient method for discovering combined causes in large observational data sets, addressing the limitations of existing causal discovery techniques.

Contribution

The paper proposes an approach that uncovers multi-factor causes without enumerating every combined variable, which makes the search for combined causes feasible on large data sets.

Findings

01  High-quality causal discoveries achieved
02  Method demonstrates high computational efficiency
03  Effective on both synthetic and real data

Abstract

In recent years, many methods have been developed for detecting causal relationships in observational data. Some of them have the potential to tackle large data sets. However, these methods fail to discover a combined cause, i.e. a multi-factor cause consisting of two or more component variables which individually are not causes. A straightforward approach to uncovering a combined cause is to include both individual and combined variables in the causal discovery using existing methods, but this scheme is computationally infeasible due to the huge number of combined variables. In this paper, we propose a novel approach to address this practical causal discovery problem, i.e. mining combined causes in large data sets. The experiments with both synthetic and real world data sets show that the proposed method can obtain high-quality causal discoveries with a high computational efficiency.
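
The two ideas in the abstract, a combined cause whose components show no individual effect and the blow-up in the number of combined variables, can be illustrated with a small sketch. This is not the paper's algorithm: the function names, the toy data (an XOR outcome over two binary variables), and the use of a simple risk difference as an association measure are all assumptions made for illustration, and association alone does not establish causation.

```python
from itertools import product
from math import comb


def count_combined_variables(num_vars: int, max_size: int) -> int:
    """Count variable subsets of size 2..max_size (hypothetical helper).

    Each subset is one candidate combined variable, so this is a lower
    bound on what the naive "add every combination" scheme must test.
    """
    return sum(comb(num_vars, k) for k in range(2, max_size + 1))


def risk_difference(rows, exposure: str, outcome: str) -> float:
    """P(outcome=1 | exposure=1) - P(outcome=1 | exposure=0)."""
    exposed = [r[outcome] for r in rows if r[exposure] == 1]
    unexposed = [r[outcome] for r in rows if r[exposure] == 0]
    return sum(exposed) / len(exposed) - sum(unexposed) / len(unexposed)


# Toy data (assumed): each (a, b) pattern occurs once and z = a XOR b.
rows = []
for a, b in product([0, 1], repeat=2):
    rows.append({
        "a": a,
        "b": b,
        "z": a ^ b,
        # One candidate combined variable: "a is 1 and b is 0".
        "a_and_not_b": int(a == 1 and b == 0),
    })

# Individually, neither variable is associated with z.
print(risk_difference(rows, "a", "z"))
print(risk_difference(rows, "b", "z"))

# The combined variable is strongly associated with z.
print(risk_difference(rows, "a_and_not_b", "z"))

# Naive candidate count for 100 variables, combinations of size 2 and 3.
print(count_combined_variables(100, 3))
```

Even when combinations are capped at three variables, 100 variables already yield 166,650 candidate subsets, which suggests why testing every combined variable with an existing causal discovery method does not scale.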

Taxonomy

Topics: Bayesian Modeling and Causal Inference · Data Mining Algorithms and Applications · Data Quality and Management