How good is my story? Towards quantitative metrics for evaluating LLM-generated XAI narratives

Timour Ichmoukhamedov; James Hinns; David Martens

arXiv:2412.10220·cs.CL·December 16, 2024

TL;DR

This paper proposes a framework and a set of automated metrics for evaluating LLM-generated narratives that explain tabular classification models. It addresses the need to assess XAI narratives without human surveys, and uses the metrics to surface challenges such as hallucinations.

Contribution

It introduces a novel framework and metrics for quantitatively evaluating LLM-generated XAI narratives without human surveys.

Findings

01. Automated metrics can effectively compare LLMs in generating explanations.

02. The approach reveals challenges such as hallucinations in LLM explanations.

03. Metrics help identify differences across datasets and prompt types.

Abstract

A rapidly developing application of LLMs in XAI is to convert quantitative explanations such as SHAP into user-friendly narratives to explain the decisions made by smaller prediction models. Evaluating the narratives without relying on human preference studies or surveys is becoming increasingly important in this field. In this work we propose a framework and explore several automated metrics to evaluate LLM-generated narratives for explanations of tabular classification tasks. We apply our approach to compare several state-of-the-art LLMs across different datasets and prompt types. As a demonstration of their utility, these metrics allow us to identify new challenges related to LLM hallucinations for XAI narratives.
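
To make the idea of an automated narrative metric concrete, here is a minimal sketch in Python. It is not taken from the paper or its repository: the function names, the `claimed_signs` input format, and the two scores are assumptions chosen for illustration. It also assumes that the claims made by the narrative (which features are mentioned, and in which direction each is said to push the prediction) have already been extracted into a dictionary by some earlier step.

```python
from typing import Dict, List


def sign(x: float) -> int:
    """Return +1, -1, or 0 depending on the sign of x."""
    return (x > 0) - (x < 0)


def sign_agreement(shap_values: Dict[str, float],
                   claimed_signs: Dict[str, int]) -> float:
    """Hypothetical metric: fraction of mentioned features whose claimed
    direction (+1 or -1) matches the sign of the SHAP value.

    Features mentioned in the narrative but absent from the explanation
    are ignored here and reported by hallucinated_features instead.
    """
    mentioned = [f for f in claimed_signs if f in shap_values]
    if not mentioned:
        return 0.0
    hits = sum(1 for f in mentioned
               if sign(shap_values[f]) == claimed_signs[f])
    return hits / len(mentioned)


def hallucinated_features(shap_values: Dict[str, float],
                          claimed_signs: Dict[str, int]) -> List[str]:
    """Hypothetical check: features the narrative talks about that do not
    appear in the underlying explanation at all."""
    return sorted(f for f in claimed_signs if f not in shap_values)


# Toy inputs (made up for illustration, not real results).
shap_values = {"age": 0.30, "income": -0.12, "tenure": 0.05}
claimed_signs = {"age": +1, "income": +1, "credit_score": -1}

print(sign_agreement(shap_values, claimed_signs))         # 0.5
print(hallucinated_features(shap_values, claimed_signs))  # ['credit_score']
```

In this toy case the narrative gets the direction of `age` right and `income` wrong, and mentions a feature, `credit_score`, that is not part of the explanation. Scores of this kind can be computed without a human survey, which is the property the abstract emphasizes.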

Code & Models

Repositories

admantwerp/shapnarrative-metrics
Official repository

Taxonomy

Methods: Shapley Additive Explanations