Sensivity of LLMs' Explanations to the Training Randomness: Context, Class & Task Dependencies
Romain Loncour, Jérémie Bogaert, François-Xavier Standaert

TL;DR
This paper investigates how strongly the explanations of transformer-based language models vary with training randomness, and shows that (syntactic) context, class, and task each have a statistically significant influence on that variation.
Contribution
It provides a systematic analysis of the factors affecting explanation stability in LLMs, highlighting the varying impact of context, class, and task on explanation sensitivity.
Findings
The (syntactic) context has the smallest impact on how sensitive explanations are to training randomness.
The classes to be learned have a medium impact on this sensitivity.
The task has the largest impact on this sensitivity.
Abstract
Transformer models are now a cornerstone of natural language processing. Yet, explaining their decisions remains a challenge. It was shown recently that the same model trained on the same data with different randomness can lead to very different explanations. In this paper, we investigate how the (syntactic) context, the classes to be learned and the tasks influence the sensitivity of these explanations to randomness. We show that they all have a statistically significant impact: smallest for the (syntactic) context, medium for the classes and largest for the tasks.
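To make the notion of "sensitivity to training randomness" concrete, here is a minimal sketch of one way such sensitivity could be quantified. It is an illustrative assumption, not the authors' method: the abstract does not state which explanation technique or statistic the paper uses. The sketch supposes that each training seed yields one vector of per-token attribution scores for the same input, and it reports the mean pairwise Pearson correlation between those vectors (lower agreement means higher sensitivity). The names `pearson`, `mean_pairwise_agreement`, and `seed_runs`, as well as the numeric scores, are hypothetical.

```python
from itertools import combinations
from statistics import mean


def pearson(x, y):
    """Pearson correlation between two equal-length score vectors."""
    if len(x) != len(y):
        raise ValueError("attribution vectors must have the same length")
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        raise ValueError("correlation is undefined for a constant vector")
    return cov / (vx * vy) ** 0.5


def mean_pairwise_agreement(attributions):
    """Average correlation over all pairs of seeds.

    `attributions` holds one per-token score vector per training seed,
    all computed for the same input text.
    """
    pairs = list(combinations(attributions, 2))
    if not pairs:
        raise ValueError("need attributions from at least two seeds")
    return mean(pearson(a, b) for a, b in pairs)


# Hypothetical attribution scores for one 4-token input, from three
# models that differ only in their training seed.
seed_runs = [
    [0.1, 0.7, 0.1, 0.1],
    [0.2, 0.6, 0.1, 0.1],
    [0.6, 0.1, 0.2, 0.1],
]
agreement = mean_pairwise_agreement(seed_runs)
print(f"mean pairwise agreement: {agreement:.3f}")
```

Under these assumptions, comparing the agreement score across groups of inputs (for example, grouped by context, by class, or by task) would be one way to probe the kind of dependencies the abstract describes.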
Taxonomy
Topics: Explainable Artificial Intelligence (XAI) · Topic Modeling · Natural Language Processing Techniques
