From Feature Importance to Natural Language Explanations Using LLMs with RAG
Sule Tekkesinoglu, Lars Kunze

TL;DR
This paper presents a method using large language models with external knowledge and counterfactual reasoning to generate natural language explanations of model decisions, enhancing interpretability in scene understanding tasks.
Contribution
It introduces traceable question-answering with an external knowledge base and a novel subtractive counterfactual approach for feature importance, integrating social explanation characteristics into LLM responses.
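To make the subtractive counterfactual idea concrete, here is a minimal sketch, not the authors' implementation. It assumes that "removing" a feature means resetting it to a baseline value, and that importance is the resulting drop in the target-class probability. The names `subtractive_importance`, `toy_predict`, and the scene features are hypothetical and chosen only for illustration.

```python
from typing import Callable, Dict, Sequence


def subtractive_importance(
    predict: Callable[[Dict[str, float]], Sequence[float]],
    features: Dict[str, float],
    target_class: int,
    baseline: float = 0.0,
) -> Dict[str, float]:
    """Score each feature by how much the target-class probability
    drops when that single feature is removed from the input."""
    full_prob = predict(features)[target_class]
    importance = {}
    for name in features:
        ablated = dict(features)
        # Assumption: a feature is "removed" by resetting it to a baseline.
        ablated[name] = baseline
        importance[name] = full_prob - predict(ablated)[target_class]
    return importance


def toy_predict(f: Dict[str, float]) -> Sequence[float]:
    # Hypothetical stand-in for a scene-understanding model.
    # Returns [P(go), P(stop)] from a hand-picked linear score.
    score = 0.6 * f["pedestrian"] + 0.3 * f["red_light"] + 0.1 * f["wet_road"]
    p_stop = min(max(score, 0.0), 1.0)
    return [1.0 - p_stop, p_stop]


# Illustrative scene: pedestrian and red light present, road is dry.
scene = {"pedestrian": 1.0, "red_light": 1.0, "wet_road": 0.0}
scores = subtractive_importance(toy_predict, scene, target_class=1)
top_feature = max(scores, key=scores.get)
```

In this toy setup the pedestrian feature receives the largest score, because removing it lowers the "stop" probability the most. A real model would replace `toy_predict`, and the choice of baseline is a design decision that this summary does not specify.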
Findings
Generated explanations include social, causal, and contrastive elements.
The approach effectively bridges complex model outputs and natural language explanations.
The results indicate potential for improved interpretability in AI systems.
Abstract
As machine learning becomes increasingly integral to autonomous decision-making processes involving human interaction, the necessity of comprehending the model's outputs through conversational means increases. Most recently, foundation models are being explored for their potential as post hoc explainers, providing a pathway to elucidate the decision-making mechanisms of predictive models. In this work, we introduce traceable question-answering, leveraging an external knowledge repository to inform the responses of Large Language Models (LLMs) to user queries within a scene understanding task. This knowledge repository comprises contextual details regarding the model's output, containing high-level features, feature importance, and alternative probabilities. We employ subtractive counterfactual reasoning to compute feature importance, a method that entails analysing output variations…
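The knowledge repository described above can be pictured as a small set of tagged entries that are retrieved and passed to the LLM together with their identifiers, so that an answer can be traced back to its sources. The sketch below is an illustration under stated assumptions, not the paper's pipeline: the `Entry` structure, the word-overlap ranking (a stand-in for embedding search), the prompt wording, and the example entries are all hypothetical, and no LLM is called.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Entry:
    entry_id: str  # identifier used to trace an answer back to its source
    kind: str      # "feature", "importance", or "alternative"
    text: str


def retrieve(query: str, entries: List[Entry], k: int = 2) -> List[Entry]:
    """Rank entries by word overlap with the query.
    Assumption: a simple stand-in for embedding-based retrieval."""
    q = set(query.lower().split())
    scored = [(len(q & set(e.text.lower().split())), e) for e in entries]
    scored = [s for s in scored if s[0] > 0]
    scored.sort(key=lambda s: s[0], reverse=True)
    return [e for _, e in scored[:k]]


def build_prompt(query: str, hits: List[Entry]) -> str:
    """Assemble a prompt that exposes entry ids so the answer can cite them."""
    context = "\n".join(f"[{e.entry_id}] {e.text}" for e in hits)
    return (
        "Answer using only the context and cite entry ids in brackets.\n"
        f"Context:\n{context}\n"
        f"Question: {query}"
    )


# Hypothetical repository contents covering the three kinds of information.
repo = [
    Entry("F1", "feature", "A pedestrian is detected on the crossing ahead."),
    Entry("I1", "importance", "Removing the pedestrian lowers the stop probability the most."),
    Entry("A1", "alternative", "The alternative action slow down has lower probability than stop."),
]
query = "why did the vehicle stop for the pedestrian"
hits = retrieve(query, repo)
prompt = build_prompt(query, hits)
```

The resulting `prompt` string would then be sent to whichever LLM is in use. Keeping the entry identifiers in the context is what makes the answer traceable in this sketch.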
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling
Methods: High-Order Consensuses
