A Hypothesis-Driven Framework for the Analysis of Self-Rationalising Models
Marc Braun, Jenny Kunz

TL;DR
This paper introduces a hypothesis-driven statistical framework using Bayesian networks to analyze and compare the decision processes of self-rationalising LLMs, aiming to assess explanation faithfulness and interpretability.
Contribution
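It proposes a framework that implements a hypothesis about how a task is solved as a Bayesian network, translates the network's internal states into template-based explanations, and compares these with LLM-generated free-text explanations to judge how similar the two decision processes are.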
Findings
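The two example Bayesian network realisations did not strongly resemble GPT-3.5's decision process.
The framework enables systematic comparison of LLM-generated explanations with explanations derived from structured models.
The framework has the potential to improve LLM interpretability in future work.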
Abstract
The self-rationalising capabilities of LLMs are appealing because the generated explanations can give insights into the plausibility of the predictions. However, how faithful the explanations are to the predictions is questionable, raising the need to explore the patterns behind them further. To this end, we propose a hypothesis-driven statistical framework. We use a Bayesian network to implement a hypothesis about how a task (in our example, natural language inference) is solved, and its internal states are translated into natural language with templates. Those explanations are then compared to LLM-generated free-text explanations using automatic and human evaluations. This allows us to judge how similar the LLM's and the Bayesian network's decision processes are. We demonstrate the usage of our framework with an example hypothesis and two realisations in Bayesian networks. The…
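The pipeline described in the abstract (a Bayesian network encoding a hypothesis, with its internal states rendered through templates) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and not the authors' network: the two latent nodes `subj_rel` and `obj_rel`, the hard rule that maps them to a label, the template wording, and the example probabilities are all hypothetical.

```python
from itertools import product

RELATIONS = ("entails", "neutral", "contradicts")
PHRASES = {"entails": "entails", "neutral": "is neutral to", "contradicts": "contradicts"}

def label_given_relations(subj_rel, obj_rel):
    """Hypothetical deterministic CPT for P(label | subj_rel, obj_rel)."""
    if "contradicts" in (subj_rel, obj_rel):
        return "contradiction"
    if "neutral" in (subj_rel, obj_rel):
        return "neutral"
    return "entailment"

def infer(subj_dist, obj_dist):
    """Exact inference by enumeration over the two latent relation nodes.

    Returns the label marginals, the most probable label, and the most
    probable internal state that is consistent with that label.
    """
    joint = {
        (s, o): subj_dist[s] * obj_dist[o]
        for s, o in product(RELATIONS, RELATIONS)
    }
    probs = {"entailment": 0.0, "neutral": 0.0, "contradiction": 0.0}
    for (s, o), p in joint.items():
        probs[label_given_relations(s, o)] += p
    label = max(probs, key=probs.get)
    consistent = {k: p for k, p in joint.items() if label_given_relations(*k) == label}
    state = max(consistent, key=consistent.get)
    return probs, label, state

def explain(state, label, subj_pair, obj_pair):
    """Render the internal state as a natural-language explanation via a template."""
    subj_rel, obj_rel = state
    return (
        f'The subject "{subj_pair[0]}" {PHRASES[subj_rel]} "{subj_pair[1]}" and '
        f'the object "{obj_pair[0]}" {PHRASES[obj_rel]} "{obj_pair[1]}", '
        f"so the label is {label}."
    )

# Made-up relation distributions for one premise/hypothesis pair.
subj_dist = {"entails": 0.9, "neutral": 0.08, "contradicts": 0.02}
obj_dist = {"entails": 0.1, "neutral": 0.2, "contradicts": 0.7}

probs, label, state = infer(subj_dist, obj_dist)
explanation = explain(state, label, ("a man", "a person"), ("a guitar", "a piano"))
print(label, probs)
print(explanation)
```

In the framework, an explanation produced this way would then be compared with the LLM's free-text explanation for the same input, using automatic metrics and human evaluation; that comparison step is not shown here.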
Taxonomy
Topics: Complex Systems and Decision Making · Multi-Agent Systems and Negotiation
Methods: Attention Is All You Need · Cosine Annealing · Linear Layer · Byte Pair Encoding · Multi-Head Attention · Attention Dropout · Residual Connection
