CausalSent: Interpretable Sentiment Classification with RieszNet
Daniel Frees, Martin Pollack

TL;DR
CausalSent introduces a RieszNet-based neural network for interpretable sentiment analysis, accurately estimating causal effects of text features and providing insights into how specific words influence sentiment.
Contribution
The paper develops a two-headed RieszNet-based neural network architecture for treatment effect estimation in NLP, with a focus on model interpretability and on more accurate effect estimates than Bansal et al.'s earlier approach.
Findings
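The presence of the word 'love' is estimated to increase the probability of positive sentiment by 2.9%.
CausalSent reduces the MAE of effect estimates by 2-3x on semi-synthetic IMDB movie reviews, relative to the MAE Bansal et al. report on synthetic Civil Comments data.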
Framework effectively estimates causal effects in semi-synthetic and real-world data.
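
To make the MAE finding concrete, here is a minimal sketch of how the mean absolute error of effect estimates, and the reduction factor between two methods, could be computed when ground-truth effects are known (as they are in semi-synthetic data). The function name `effect_mae` and every number below are hypothetical and chosen only for illustration; none of them come from the paper.

```python
from statistics import fmean


def effect_mae(estimated, true):
    """Mean absolute error between estimated and ground-truth treatment effects."""
    if len(estimated) != len(true):
        raise ValueError("estimated and true must have the same length")
    return fmean(abs(e - t) for e, t in zip(estimated, true))


# Hypothetical ground-truth effects for three semi-synthetic treatments.
true_effects = [0.10, 0.25, 0.50]

# Hypothetical estimates from a baseline method and an improved method.
baseline_estimates = [0.16, 0.16, 0.59]
improved_estimates = [0.13, 0.22, 0.47]

baseline_mae = effect_mae(baseline_estimates, true_effects)
improved_mae = effect_mae(improved_estimates, true_effects)

# A "2-3x reduction" means this ratio lies between 2 and 3.
reduction_factor = baseline_mae / improved_mae
print(f"baseline MAE={baseline_mae:.3f}, improved MAE={improved_mae:.3f}, "
      f"reduction={reduction_factor:.2f}x")
```

In this sketch both methods are scored against the same ground truth. The comparison reported in the abstract is across different datasets, so the reduction factor there is indicative rather than a like-for-like benchmark.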
Abstract
Despite the overwhelming performance improvements offered by recent natural language processing (NLP) models, the decisions made by these models are largely a black box. Towards closing this gap, the field of causal NLP combines causal inference literature with modern NLP models to elucidate causal effects of text features. We replicate and extend Bansal et al.'s work on regularizing text classifiers to adhere to estimated effects, focusing instead on model interpretability. Specifically, we focus on developing a two-headed RieszNet-based neural network architecture which achieves better treatment effect estimation accuracy. Our framework, CausalSent, accurately predicts treatment effects in semi-synthetic IMDB movie reviews, reducing MAE of effect estimates by 2-3x compared to Bansal et al.'s MAE on synthetic Civil Comments data. With an ensemble of validated models, we perform an…
