IncomeSCM: From tabular data set to time-series simulator and causal estimation benchmark
Fredrik D. Johansson

TL;DR
IncomeSCM introduces a flexible method to transform real-world observational data into complex, sequential causal models for benchmarking causal effect estimators, addressing limitations of existing simulators.
Contribution
The paper presents a novel approach and software for converting observational data into challenging causal simulation tasks, enhancing benchmarking of causal estimators.
Findings
Effect estimates vary significantly across methods.
Estimators perform similarly at modeling factual outcomes.
Benchmark tasks reveal differences in causal estimation quality.
Abstract
Evaluating observational estimators of causal effects demands information that is rarely available: unconfounded interventions and outcomes from the population of interest, created either by randomization or adjustment. As a result, it is customary to fall back on simulators when creating benchmark tasks. Simulators offer great control but are often too simplistic to make challenging tasks, either because they are hand-designed and lack the nuances of real-world data, or because they are fit to observational data without structural constraints. In this work, we propose a general, repeatable strategy for turning observational data into sequential structural causal models and challenging estimation tasks by following two simple principles: 1) fitting real-world data where possible, and 2) creating complexity by composing simple, hand-designed mechanisms. We implement these ideas in a…
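The two principles in the abstract can be illustrated with a minimal sketch. It is not the paper's software or its API: the variables (`age`, `training`, `income`), all coefficients, and the synthetic stand-in for "observed" data are hypothetical and chosen only for illustration. One mechanism (treatment assignment) is fit to data, another (the income transition) is hand-designed, and composing them over a few time steps yields a simulator in which the ground-truth effect can be obtained by intervention and compared with a naive observational estimate.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit_logistic(x, y, steps=300, lr=0.5):
    """Principle 1: fit a mechanism p(y=1 | x) to data (plain gradient descent)."""
    w, b = 0.0, 0.0
    for _ in range(steps):
        p = sigmoid(w * x + b)
        w -= lr * np.mean((p - y) * x)
        b -= lr * np.mean(p - y)
    return w, b

def simulate(n, horizon, w, b, seed, do_training=None):
    """Principle 2: compose simple mechanisms into a sequential model.

    If do_training is None, treatment follows the fitted mechanism
    (observational regime); otherwise it is set by intervention.
    """
    rng = np.random.default_rng(seed)
    age = rng.normal(size=n)      # standardized confounder (hypothetical)
    u = rng.random(n)             # always drawn so noise streams stay aligned
    if do_training is None:
        training = (u < sigmoid(w * age + b)).astype(float)
    else:
        training = np.full(n, float(do_training))
    income = np.zeros(n)
    for _ in range(horizon):
        # Hand-designed transition; coefficients are assumptions, not from the paper.
        noise = rng.normal(scale=0.1, size=n)
        income = 0.8 * income + 0.5 * age + 1.0 * training + noise
    return training, income

# Synthetic stand-in for a real tabular data set (assumption for this sketch).
rng = np.random.default_rng(0)
obs_age = rng.normal(size=5000)
obs_training = (rng.random(5000) < sigmoid(1.5 * obs_age)).astype(float)
w, b = fit_logistic(obs_age, obs_training)

n, horizon, seed = 20000, 3, 1
t_obs, y_obs = simulate(n, horizon, w, b, seed)
_, y1 = simulate(n, horizon, w, b, seed, do_training=1)
_, y0 = simulate(n, horizon, w, b, seed, do_training=0)

ate = float(np.mean(y1 - y0))  # ground truth from the simulator
naive = float(y_obs[t_obs == 1].mean() - y_obs[t_obs == 0].mean())  # confounded
print(f"true ATE: {ate:.3f}, naive difference in means: {naive:.3f}")
```

Because `age` drives both treatment uptake and income in this toy model, the naive difference in means overstates the true effect, which is the kind of gap a benchmark task is meant to expose.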
Taxonomy
Topics: Advanced Causal Inference Techniques · Bayesian Modeling and Causal Inference · Decision-Making and Behavioral Economics
Methods: Sparse Evolutionary Training
