Double Machine Learning for Adaptive Causal Representation in High-Dimensional Data
Lynda Aouar, Han Yu

TL;DR
This paper introduces a novel support points sample splitting method for adaptive causal representation learning in high-dimensional data, improving efficiency and accuracy in causal inference through double machine learning techniques.
Contribution
The paper proposes the support points sample splitting (SPSS) method for better data representation and integrates it with double machine learning for causal inference, reporting superior performance over traditional methods.
Findings
Deep learning with SPSS outperforms SVM in efficiency and accuracy.
Hybrid deep learning and super learner methods with SPSS outperform traditional models.
SPSS provides a more representative data split than random splitting.
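The abstract describes SPSS as a subsampling method based on energy distance. As a rough illustration of what "more representative" can mean under that criterion, the sketch below computes the sample energy distance between a subsample and the full data, then improves a random subsample with a simple accept-if-better exchange search. The exchange search is a hypothetical stand-in chosen for brevity, not the paper's support points algorithm, and the names `energy_distance` and `exchange_subsample` are assumptions made for this example.

```python
import numpy as np

def energy_distance(a, b):
    """Sample energy distance 2*E|A-B| - E|A-A'| - E|B-B'| (V-statistic form)."""
    def mean_dist(x, y):
        diff = x[:, None, :] - y[None, :, :]
        return np.sqrt((diff ** 2).sum(axis=-1)).mean()
    return 2.0 * mean_dist(a, b) - mean_dist(a, a) - mean_dist(b, b)

def exchange_subsample(data, init_idx, n_iter, rng):
    """Hypothetical stand-in for support points: swap one index at a time
    and keep the swap only if it lowers the energy distance to the full data."""
    idx = init_idx.copy()
    best = energy_distance(data[idx], data)
    for _ in range(n_iter):
        pos = rng.integers(len(idx))    # position in the subsample to replace
        cand = rng.integers(len(data))  # candidate row from the full data
        if cand in idx:
            continue
        trial = idx.copy()
        trial[pos] = cand
        score = energy_distance(data[trial], data)
        if score < best:
            idx, best = trial, score
    return idx, best

rng = np.random.default_rng(0)
full = rng.normal(size=(300, 2))  # synthetic data, for illustration only

# Baseline: a plain random subsample of 20 points.
random_idx = rng.choice(len(full), size=20, replace=False)
ed_random = energy_distance(full[random_idx], full)

# Exchange search started from the same random subsample.
_, ed_exchange = exchange_subsample(full, random_idx, n_iter=300, rng=rng)

print(f"energy distance, random subsample:   {ed_random:.4f}")
print(f"energy distance, exchange subsample: {ed_exchange:.4f}")
```

Because the search only accepts swaps that lower the criterion, the exchanged subsample is never farther from the full data in energy distance than the random one it started from. How much closer it gets depends on the data and the number of iterations.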
Abstract
Adaptive causal representation learning from observational data is presented, integrated with an efficient sample splitting technique within the semiparametric estimating equation framework. Support points sample splitting (SPSS), a subsampling method based on energy distance, is employed for efficient double machine learning (DML) in causal inference. In contrast to traditional random splitting, the support points are selected and split as optimal representative points of the full raw data in a random sample, providing an optimal sub-representation of the underlying data generating distribution. They offer the best representation of a full big dataset, whereas traditional random data splitting most likely does not preserve the unit structural information of the underlying distribution. Three machine learning estimators were adopted for causal inference, support…
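For readers unfamiliar with DML, the following is a minimal sketch of cross-fitted estimation in a partially linear model, Y = theta*D + g(X) + e, using the residual-on-residual (partialling-out) estimator. Everything here is an assumption made for illustration: the simulated data, the model form, the helper names `features`, `fit_predict`, and `dml_plm`, and the least-squares nuisance learner, which stands in for the machine learning estimators the paper uses. The folds are drawn at random; the sample split is the step where a method such as SPSS would be substituted.

```python
import numpy as np

def features(x):
    """Intercept, linear, and squared terms (illustrative basis only)."""
    return np.column_stack([np.ones(len(x)), x, x ** 2])

def fit_predict(x_train, y_train, x_test):
    """Least-squares nuisance learner; a stand-in for any ML regressor."""
    coef, *_ = np.linalg.lstsq(features(x_train), y_train, rcond=None)
    return features(x_test) @ coef

def dml_plm(y, d, x, folds):
    """Cross-fitted partialling-out estimate of theta.

    `folds` is a list of index arrays that partition the sample. Nuisance
    functions are fit on the complement of each fold and evaluated on it.
    """
    res_y = np.empty_like(y)
    res_d = np.empty_like(d)
    all_idx = np.arange(len(y))
    for test_idx in folds:
        train_idx = np.setdiff1d(all_idx, test_idx)
        res_y[test_idx] = y[test_idx] - fit_predict(x[train_idx], y[train_idx], x[test_idx])
        res_d[test_idx] = d[test_idx] - fit_predict(x[train_idx], d[train_idx], x[test_idx])
    # Solve the orthogonal moment condition sum(res_d * (res_y - theta * res_d)) = 0.
    return (res_d @ res_y) / (res_d @ res_d)

rng = np.random.default_rng(1)
n, theta_true = 2000, 1.5
x = rng.normal(size=(n, 3))
d = 0.5 * x[:, 0] + 0.25 * x[:, 1] ** 2 + rng.normal(size=n)
y = theta_true * d + x[:, 0] + x[:, 1] ** 2 - 0.5 * x[:, 2] + rng.normal(size=n)

# Random two-fold split; a representative split would replace this line.
folds = np.array_split(rng.permutation(n), 2)
theta_hat = dml_plm(y, d, x, folds)
print(f"theta_hat = {theta_hat:.3f} (true value {theta_true})")
```

Fitting the nuisance regressions on one part of the sample and computing residuals on the other is what keeps overfitting in the learners from biasing the estimate of theta, which is why the quality of the split matters for the final estimator.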
Taxonomy
Topics: Anomaly Detection Techniques and Applications · Fault Detection and Control Systems · Machine Learning and Data Classification
Methods: Support Vector Machine
