How PC-based Methods Err: Towards Better Reporting of Assumption Violations and Small Sample Errors
Sofia Faltenbacher, Jonas Wahl, Rebecca Herman, Jakob Runge

TL;DR
This paper examines how errors in PC-based causal discovery methods propagate and introduces coherency scores to detect assumption violations and small sample errors without ground truth, improving error detection in practical scenarios.
Contribution
The paper introduces coherency scores that identify assumption violations and small sample errors in PC-based causal discovery without a ground truth, using only the statistical tests the discovery algorithm has already executed.
Findings
Local errors can propagate throughout the output graph of a PC-based method, so seemingly innocuous errors can become consequential.
Coherency scores effectively detect assumption violations and small sample errors.
Scores are computationally efficient and validated on simulated and real data.
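To make the propagation finding concrete, here is a minimal sketch of a simplified PC procedure run on a hypothetical four-variable example. It is not the authors' code and does not implement their coherency scores. The toy graph (A -> C <- B, C -> D), the helper names `make_oracle` and `pc_sketch`, and the restriction to collider orientation plus a single Meek rule are all assumptions made for illustration.

```python
from itertools import combinations

NODES = ["A", "B", "C", "D"]

# Conditional independences implied by the assumed toy DAG A -> C <- B, C -> D.
TRUE_INDEPENDENCES = {
    ("A", "B", frozenset()),
    ("A", "D", frozenset({"C"})),
    ("B", "D", frozenset({"C"})),
    ("A", "D", frozenset({"B", "C"})),
    ("B", "D", frozenset({"A", "C"})),
}


def make_oracle(flipped=frozenset()):
    """Return a CI 'test' that is exact except for the answers listed in `flipped`."""
    def indep(x, y, cond):
        key = (*sorted((x, y)), frozenset(cond))
        answer = key in TRUE_INDEPENDENCES
        return (not answer) if key in flipped else answer
    return indep


def pc_sketch(nodes, indep):
    """Simplified PC: skeleton search, collider orientation, Meek rule 1 only."""
    adj = {v: set(nodes) - {v} for v in nodes}
    sepset = {}
    size = 0
    # Skeleton phase: remove an edge once some conditioning set separates it.
    while any(len(adj[x]) - 1 >= size for x in nodes):
        for x in nodes:
            for y in sorted(adj[x]):
                for cond in combinations(sorted(adj[x] - {y}), size):
                    if indep(x, y, cond):
                        adj[x].discard(y)
                        adj[y].discard(x)
                        sepset[frozenset((x, y))] = set(cond)
                        break
        size += 1
    # Collider phase: orient x -> z <- y for unshielded triples with z not in sepset(x, y).
    directed = set()
    for z in nodes:
        for x, y in combinations(sorted(adj[z]), 2):
            if y not in adj[x] and z not in sepset[frozenset((x, y))]:
                directed |= {(x, z), (y, z)}
    # Meek rule 1: if x -> z, z - w is undirected, and x, w are non-adjacent, orient z -> w.
    changed = True
    while changed:
        changed = False
        for x, z in list(directed):
            for w in sorted(adj[z] - {x}):
                if w not in adj[x] and (z, w) not in directed and (w, z) not in directed:
                    directed.add((z, w))
                    changed = True
    skeleton = {frozenset((x, y)) for x in nodes for y in adj[x]}
    return skeleton, directed


# All tests correct versus a single wrong answer for the test "A independent of B".
clean_skeleton, clean_directed = pc_sketch(NODES, make_oracle())
wrong = frozenset({("A", "B", frozenset())})
noisy_skeleton, noisy_directed = pc_sketch(NODES, make_oracle(wrong))

print(sorted(clean_directed))
print(sorted(noisy_directed))
```

With every test answered correctly, the sketch orients all three edges. Flipping the single test between A and B adds a spurious A - B edge, which shields the collider at C and leaves the whole graph unoriented, including the edge C - D that the faulty test never touched.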
Abstract
Causal discovery methods based on the PC algorithm are proven to be sound if all structural assumptions are fulfilled and all conditional independence tests are correct. This idealized setting is rarely given in real data. In this work, we first analyze how local errors can propagate throughout the output graph of a PC-based method, highlighting how consequential seemingly innocuous errors can become. Next, we introduce coherency scores to find assumption violations and small sample errors in the absence of a ground truth. These scores do not require statistical tests beyond those already executed by the causal discovery algorithm. Errors detected by our approach extend the set of errors that can be detected by comparable existing methods. We place our computationally cheap global error detection and quantification scores as a bridge between computationally expensive global…
Taxonomy
Topics: Bayesian Modeling and Causal Inference · Data Quality and Management · Rough Sets and Fuzzy Logic
