MCCE: Missingness-aware Causal Concept Explainer
Jifan Gao, Guanhua Chen

TL;DR
MCCE is a novel framework that accurately estimates causal effects of concepts in machine learning models even when some concepts are missing, improving interpretability despite incomplete data.
Contribution
It introduces a missingness-aware causal explanation method that accounts for unobserved concepts, addressing a key limitation of existing approaches.
Findings
MCCE outperforms state-of-the-art methods in causal effect estimation.
It effectively handles incomplete concept annotations in real-world data.
It provides both local and global explanations of model behavior.
Abstract
Causal concept effect estimation is gaining increasing interest in the field of interpretable machine learning. This general approach explains the behaviors of machine learning models by estimating the causal effect of human-understandable concepts, which represent high-level knowledge more comprehensibly than raw inputs like tokens. However, existing causal concept effect explanation methods assume complete observation of all concepts involved within the dataset, which can fail in practice due to incomplete annotations or missing concept data. We theoretically demonstrate that unobserved concepts can bias the estimation of the causal effects of observed concepts. To address this limitation, we introduce the Missingness-aware Causal Concept Explainer (MCCE), a novel framework specifically designed to estimate causal concept effects when not all concepts are observable. Our framework…
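The abstract's claim that unobserved concepts can bias effect estimates for observed concepts can be illustrated with a small simulation. This is a minimal sketch under a linear toy model, not the MCCE method: the data-generating process, the effect sizes, and the names `simulate`, `ols`, `c_ob`, and `c_un` are all assumptions made for illustration. It regresses an output on the observed concept alone and compares the estimate when the unobserved concept is uncorrelated with it versus correlated with it.

```python
import numpy as np


def ols(X, y):
    """Least-squares coefficients for y ~ X (no intercept)."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef


def simulate(rho, n=20000, seed=0):
    """Estimate the observed concept's effect while ignoring the unobserved one.

    rho is the (hypothetical) correlation between the observed concept c_ob
    and the unobserved concept c_un.
    """
    rng = np.random.default_rng(seed)
    c_ob = rng.standard_normal(n)
    # Build c_un with unit variance and correlation rho with c_ob.
    c_un = rho * c_ob + np.sqrt(1.0 - rho**2) * rng.standard_normal(n)
    beta_ob, beta_un = 1.0, 2.0  # hypothetical true effects
    y = beta_ob * c_ob + beta_un * c_un + 0.1 * rng.standard_normal(n)
    # Naive estimator: regress on the observed concept only.
    return ols(c_ob[:, None], y)[0]


est_orthogonal = simulate(rho=0.0)  # unobserved concept uncorrelated with observed
est_correlated = simulate(rho=0.5)  # unobserved concept correlated with observed

print(f"true effect: 1.0")
print(f"estimate, uncorrelated unobserved concept: {est_orthogonal:.3f}")
print(f"estimate, correlated unobserved concept:   {est_correlated:.3f}")
```

In this toy setup the naive estimate stays near the true effect when the two concepts are uncorrelated and drifts toward `beta_ob + rho * beta_un` when they are correlated, which is the kind of bias a missingness-aware estimator has to correct for.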
Peer Reviews
Decision: ICLR 2025 Conference Withdrawn Submission
1. This paper addresses an important and previously unexplored gap in concept-based explanations: handling unobserved concepts. 2. This paper is clearly written and easy to follow.
1. This paper makes a very strong assumption: $H$ contains all necessary information about all outputs. 2. This paper does not prove that $\hat{\beta}_{ob}$ estimated by the proposed MCCE is an unbiased estimate of $\beta_{ob}^*$. 3. The number of pseudo-concepts ($j$) is a critical hyper-parameter, yet there is no theoretical justification for choosing this number. 4. This paper conducts experiments on only one test dataset. The authors should at least run experiments on another image dataset.
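For context on the unbiasedness point, the standard omitted-variable result for ordinary least squares shows what goes wrong when only observed concepts are used. This is a textbook sketch under an assumed linear model, not a statement about the MCCE estimator; the symbols $C_{ob}$, $C_{un}$, $\beta_{un}$, and $\varepsilon$ are notation introduced here for illustration. Suppose $y = C_{ob}\beta_{ob}^* + C_{un}\beta_{un} + \varepsilon$ with $\mathbb{E}[\varepsilon \mid C_{ob}, C_{un}] = 0$, and the regression uses $C_{ob}$ only:

$$
\hat{\beta}_{ob} = (C_{ob}^\top C_{ob})^{-1} C_{ob}^\top y,
\qquad
\mathbb{E}\big[\hat{\beta}_{ob} \mid C_{ob}, C_{un}\big] = \beta_{ob}^* + (C_{ob}^\top C_{ob})^{-1} C_{ob}^\top C_{un}\,\beta_{un}.
$$

The second term is the bias. It vanishes when $C_{ob}^\top C_{un} = 0$ (the unobserved concepts are orthogonal to the observed ones) or when $\beta_{un} = 0$; otherwise $\hat{\beta}_{ob}$ is not an unbiased estimate of $\beta_{ob}^*$.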
The paper addresses a practical problem of estimating the causal effect of certain concepts with incomplete concept set.
**Model assumptions**. The paper does not adequately justify its model assumptions. (1) Why are unobserved concepts assumed to be orthogonal to the observed ones? I believe the estimation is far trickier when the observed and unobserved concepts are correlated. (2) Why is the concept-to-output mapping assumed linear? Many people make the linearity assumption, but a rationale is needed here nevertheless. (3) L 228-229 "We hypothesize that H contains all necessary information about all concepts, includ
The idea of using concepts to interpret the treatment effect of an intervention is a really nice one: just as doctors don't recall precise details of earlier patients when determining the course of treatment for a new patient, but rather perform diagnostics on the basis of some high-level characteristics that patients may share, this approach seems like the right way to think about the effects of interventions in practice.
1) The paper is missing references to crucial prior work on treatment effect estimation using concepts (see this work by Goyal et al 2019, https://arxiv.org/abs/1907.07165). As such the novelty of the work is limited. While I understand the contribution may be viewed as the fact that this work addresses the issue of missingness of concepts while prior work does not, I feel that in order to address missingness adequately, the work is missing metrics that migh
1. The problem of missing causal concepts is practically important and worth studying.
1. Theoretical analysis should be strengthened. 2. The empirical justifications are not well presented.
Thank you so much for submitting this work. I enjoyed reading this paper and appreciate the time taken to make it clear and easy to follow. Below are what I believe are this paper’s main strengths: 1. **[Significance] (Major)** The paper's main intent and goal, that of modeling causal effects from both observed and unobserved concepts, is important for key tasks in interpretability, causality, and XAI. As such, I believe this paper’s goals and studies can interest several research communities.
In contrast, I believe the following are some of this work’s limitations: 1. **[Significance] (Critical)** The empirical evaluation is very limited compared to what one would expect from an ICLR paper. This is perfectly ok if the work instead focuses on providing novel theoretical results (e.g., proofs) or insights/unexpected realizations. However, I believe this work in its current state does not provide enough evidence or results (either theoretically or empirically) to entirely convince me t
Taxonomy
Topics: Topic Modeling · Semantic Web and Ontologies · Natural Language Processing Techniques
