Causal Intersectionality and Dual Form of Gradient Descent for Multimodal Analysis: a Case Study on Hateful Memes
Yosuke Miyanishi, Minh Le Nguyen

TL;DR
This paper explores how causal intersectionality and gradient-based methods can improve understanding and explanation of multimodal models, especially in detecting hateful memes, by framing the task as an Average Treatment Effect estimation.
Contribution
It introduces a novel approach combining causal intersectionality with gradient-based analysis to interpret multimodal models and demonstrates its application in hateful meme detection.
Findings
Hateful meme detection can be modeled as ATE estimation using intersectionality principles.
Summarized gradient-based attention scores reveal distinct behaviors across three Transformer models.
Llama-2 can discern intersectional aspects of the detection task via in-context learning.
Abstract
Amidst the rapid expansion of Machine Learning (ML) and Large Language Models (LLMs), understanding the semantics within their mechanisms is vital. Causal analyses define semantics, while gradient-based methods are essential to eXplainable AI (XAI), interpreting the model's 'black box'. Integrating these, we investigate how a model's mechanisms reveal its causal effect on evidence-based decision-making. Research indicates intersectionality - the combined impact of an individual's demographics - can be framed as an Average Treatment Effect (ATE). This paper demonstrates that hateful meme detection can be viewed as an ATE estimation using intersectionality principles, and summarized gradient-based attention scores highlight distinct behaviors of three Transformer models. We further reveal that LLM Llama-2 can discern the intersectional aspects of the detection through in-context learning…
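To make the ATE framing concrete, the following is a minimal difference-in-means sketch. It is not the authors' estimator. The function name, the scores, and the treatment labels are all hypothetical. It assumes that "treatment" means the original image-text pairing of a meme, and "control" means a counterfactual version in which one modality has been swapped for a benign one.

```python
from statistics import mean


def average_treatment_effect(outcomes, treated):
    """Difference-in-means estimate of E[Y | T=1] - E[Y | T=0].

    outcomes: model hatefulness scores, one per sample (hypothetical values).
    treated:  booleans, True for the original pairing, False for the
              counterfactual with one modality swapped (an assumption).
    """
    treated_y = [y for y, t in zip(outcomes, treated) if t]
    control_y = [y for y, t in zip(outcomes, treated) if not t]
    if not treated_y or not control_y:
        raise ValueError("need at least one treated and one control sample")
    return mean(treated_y) - mean(control_y)


# Illustrative, made-up scores: two original memes and two counterfactuals.
scores = [0.9, 0.8, 0.2, 0.1]
is_original = [True, True, False, False]
ate = average_treatment_effect(scores, is_original)
print(f"estimated ATE: {ate:.2f}")
```

A large positive value in this toy setup would indicate that the score depends on the combination of image and text rather than on either modality alone, which is the intuition behind treating intersectionality as a treatment effect.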
Taxonomy
Topics: Computational and Text Analysis Methods · Hate Speech and Cyberbullying Detection · Ethics and Social Impacts of AI
Methods: Attention Is All You Need · Linear Layer · Layer Normalization · Byte Pair Encoding · Absolute Position Encodings · Position-Wise Feed-Forward Layer · Residual Connection · Multi-Head Attention · Softmax · Dropout
