Shadow Datasets, New challenging datasets for Causal Representation Learning

Jiageng Zhu; Hanchen Xie; Jianhua Wu; Jiazhi Li; Mahyar Khayatkhoei; Mohamed E. Hussein; Wael AbdAlmageed

arXiv:2308.05707 · cs.LG · August 15, 2023

Open Access · 1 Repo

TL;DR

This paper introduces two new challenging datasets with complex causal graphs for evaluating causal representation learning, addressing limitations of existing datasets in complexity and distribution alignment.

Contribution

The paper presents two novel datasets with more diverse factors and complex causal structures, and modifies existing datasets for better distribution alignment in CRL evaluation.

Findings

01

New datasets with a larger number of generative factors

02

More sophisticated causal graphs

03

Improved dataset alignment with real distributions

Abstract

Discovering causal relations among semantic factors is an emerging topic in representation learning. Most causal representation learning (CRL) methods are fully supervised, which is impractical because labeling is costly. To resolve this restriction, weakly supervised CRL methods were introduced. To evaluate CRL performance, four existing datasets are used: Pendulum, Flow, CelebA(BEARD), and CelebA(SMILE). However, existing CRL datasets are limited to simple graphs with few generative factors. Thus, we propose two new datasets with a larger number of diverse generative factors and more sophisticated causal graphs. In addition, for the current real datasets, CelebA(BEARD) and CelebA(SMILE), the originally proposed causal graphs are not aligned with the dataset distributions. Thus, we propose modifications to them.

Code & Models

Repositories

Jiagengzhu/Shadow-dataset-for-crl (Official)

Taxonomy

Topics: Topic Modeling · Advanced Graph Neural Networks · Bayesian Modeling and Causal Inference