A Guide for Practical Use of ADMG Causal Data Augmentation

Audrey Poinsot; Alessandro Leite

arXiv:2304.01237 · cs.LG · April 10, 2023 · 1 citation

TL;DR

This paper evaluates the ADMG causal data augmentation method for tabular data, highlighting its strengths and limitations in small-data regimes and providing insights for effective application.

Contribution

It offers an experimental analysis of ADMG augmentation, clarifying when prior causal knowledge improves data generation and model robustness.

Findings

01

The ADMG augmentation method is model-agnostic and independent of the underlying data mechanism.

02

It requires a minimum number of observations, which can be hard to reach in small-data settings.

03

It propagates outliers, which degrades model performance.
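
To make the third finding concrete, here is a minimal sketch of how recombining samples under an independence assumption can spread a single outlier. It is not the paper's code: the helper `augment_independent` and the toy data are hypothetical, and it covers only the trivial case of two variables that the causal graph declares independent. The method evaluated in the paper handles general ADMGs and is more involved.

```python
from itertools import product


def augment_independent(rows):
    """Recombine (a, b) rows, assuming the graph says A and B are independent.

    Hypothetical helper for illustration only. Under independence the joint
    distribution factorizes, so any observed A value may be paired with any
    observed B value.
    """
    a_values = [a for a, _ in rows]
    b_values = [b for _, b in rows]
    return list(product(a_values, b_values))


# Toy data (made up): the A value 100.0 in the last row plays the outlier.
observed = [(1.0, 10.0), (2.0, 20.0), (100.0, 30.0)]
augmented = augment_independent(observed)

# Count the augmented rows that carry the outlying A value.
outlier_rows = [row for row in augmented if row[0] == 100.0]
print(len(observed), len(augmented), len(outlier_rows))  # prints: 3 9 3
```

In this toy case the outlier, observed once, ends up in three augmented rows, each pairing it with a B value it may never have occurred with. Any downstream model trained on the augmented set sees all of them.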

Abstract

Data augmentation is essential when applying Machine Learning in small-data regimes. It generates new samples following the observed data distribution while increasing their diversity and variability to help researchers and practitioners improve their models' robustness and, thus, deploy them in the real world. Nevertheless, its usage in tabular data still needs to be improved, as prior knowledge about the underlying data mechanism is seldom considered, limiting the fidelity and diversity of the generated data. Causal data augmentation strategies have been pointed out as a solution to handle these challenges by relying on conditional independence encoded in a causal graph. In this context, this paper experimentally analyzed the ADMG causal augmentation method considering different settings to support researchers and practitioners in understanding under which conditions prior knowledge…

Code & Models

Repositories

audreypoinsot/admg_data_augmentation (official)

Taxonomy

Topics: Bayesian Modeling and Causal Inference · Distributed Sensor Networks and Detection Algorithms · Advanced Causal Inference Techniques