CEBaB: Estimating the Causal Effects of Real-World Concepts on NLP Model Behavior
Eldar David Abraham, Karel D'Oosterlinck, Amir Feder, Yair Ori Gat, Atticus Geiger, Christopher Potts, Roi Reichart, Zhengxuan Wu

TL;DR
This paper introduces CEBaB, a benchmark dataset for evaluating concept-based explanation methods in NLP. It frames explanation as a causal inference problem: estimating the effect of real-world concepts on model outputs.
Contribution
We propose CEBaB, a novel dataset with annotated counterfactual reviews, enabling systematic assessment of concept-based explanation methods in NLP.
Findings
CEBaB allows for quantitative comparison of explanation methods.
Different explanation techniques vary significantly in effectiveness.
The dataset facilitates understanding of how abstract concepts affect model behavior.
Abstract
The increasing size and complexity of modern ML systems has improved their predictive capabilities but made their behavior harder to explain. Many techniques for model explanation have been developed in response, but we lack clear criteria for assessing these techniques. In this paper, we cast model explanation as the causal inference problem of estimating causal effects of real-world concepts on the output behavior of ML models given actual input data. We introduce CEBaB, a new benchmark dataset for assessing concept-based explanation methods in Natural Language Processing (NLP). CEBaB consists of short restaurant reviews with human-generated counterfactual reviews in which an aspect (food, noise, ambiance, service) of the dining experience was modified. Original and counterfactual reviews are annotated with multiply-validated sentiment ratings at the aspect-level and review-level. The…
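The causal framing in the abstract can be illustrated with a minimal sketch. It assumes that the effect of changing a concept is measured as the difference between a model's output on a counterfactual review and its output on the original, averaged over review pairs. The function names, the toy model, and the example reviews below are hypothetical illustrations, not taken from the paper or its released code.

```python
from typing import Callable, List, Sequence, Tuple

Vector = List[float]
Model = Callable[[str], Vector]  # maps a review to class probabilities


def individual_effect(model: Model, original: str, counterfactual: str) -> Vector:
    """Output change when one review is swapped for its counterfactual edit."""
    out_orig = model(original)
    out_cf = model(counterfactual)
    return [c - o for o, c in zip(out_orig, out_cf)]


def average_effect(model: Model, pairs: Sequence[Tuple[str, str]]) -> Vector:
    """Mean output change over (original, counterfactual) review pairs."""
    if not pairs:
        raise ValueError("need at least one (original, counterfactual) pair")
    effects = [individual_effect(model, o, c) for o, c in pairs]
    return [sum(column) / len(effects) for column in zip(*effects)]


def toy_model(text: str) -> Vector:
    """Hypothetical stand-in for a sentiment classifier: [P(negative), P(positive)]."""
    positive = 0.9 if "delicious" in text.lower() else 0.2
    return [1.0 - positive, positive]


# Hypothetical pairs in which only the food aspect was edited (negative -> positive).
pairs = [
    ("The food was bland and the room was loud.",
     "The food was delicious and the room was loud."),
    ("Bland pasta, friendly staff.",
     "Delicious pasta, friendly staff."),
]

effect = average_effect(toy_model, pairs)
print(effect)  # roughly [-0.7, 0.7] for this toy model
```

In this sketch the human-written counterfactuals play the role of the intervention, so the averaged difference serves as a reference value against which an explanation method's estimate could be compared.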
Taxonomy
Topics: Topic Modeling · Explainable Artificial Intelligence (XAI) · Sentiment Analysis and Opinion Mining
