ConSim: Measuring Concept-Based Explanations' Effectiveness with Automated Simulatability

Antonin Poché (IRIT); Alon Jacovi; Agustin Martin Picard; Victor Boutin (CERCO UMR5549; ANITI); Fanny Jourdan

arXiv:2501.05855 · cs.CL · June 5, 2025

TL;DR

This paper introduces ConSim, an automated framework that uses large language models as simulators to evaluate concept-based explanations: an explanation is effective to the extent that it helps the simulator predict the explained model's outputs.

Contribution

We propose an evaluation framework for concept explanations that jointly assesses the quality of the concept space and how well the chosen concepts are communicated, using LLMs as simulators to make the assessment scalable and consistent.

Findings

01. LLMs provide reliable rankings of explanation methods.

02. The framework enables end-to-end evaluation of concept explanations.

03. Our empirical study demonstrates the effectiveness of the proposed approach.

Abstract

Concept-based explanations work by mapping complex model computations to human-understandable concepts. Evaluating such explanations is very difficult, as it includes not only the quality of the induced space of possible concepts but also how effectively the chosen concepts are communicated to users. Existing evaluation metrics often focus solely on the former, neglecting the latter. We introduce an evaluation framework for measuring concept explanations via automated simulatability: a simulator's ability to predict the explained model's outputs based on the provided explanations. This approach accounts for both the concept space and its interpretation in an end-to-end evaluation. Human studies for simulatability are notoriously difficult to enact, particularly at the scale of a wide, comprehensive empirical evaluation (which is the subject of this work). We propose using large language…
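The simulatability idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' protocol: the `simulatability` function, the `toy_simulator` stand-in for an LLM, and the example data are all hypothetical, chosen only to show that the score counts agreement with the explained model's outputs (not with ground-truth labels) and that an explanation's value can be read as the gain over a no-explanation baseline.

```python
from typing import Callable, Optional, Sequence

# Hypothetical interface: a simulator receives an input and an optional
# explanation, and returns its guess of the explained model's output.
Simulator = Callable[[str, Optional[str]], str]


def simulatability(
    simulator: Simulator,
    inputs: Sequence[str],
    model_outputs: Sequence[str],
    explanations: Optional[Sequence[str]] = None,
) -> float:
    """Fraction of inputs on which the simulator matches the model's output."""
    if not inputs or len(inputs) != len(model_outputs):
        raise ValueError("inputs and model_outputs must be non-empty and aligned")
    if explanations is not None and len(explanations) != len(inputs):
        raise ValueError("explanations must align with inputs")
    hits = 0
    for i, (text, model_out) in enumerate(zip(inputs, model_outputs)):
        explanation = explanations[i] if explanations is not None else None
        if simulator(text, explanation) == model_out:
            hits += 1
    return hits / len(inputs)


def toy_simulator(text: str, explanation: Optional[str]) -> str:
    # Assumption: a trivial rule standing in for an LLM call. It answers
    # "positive" only when the explanation mentions that label.
    if explanation is not None and "positive" in explanation:
        return "positive"
    return "negative"


# Illustrative data only: outputs of the explained model, not true labels.
inputs = ["great film", "dull plot", "loved it", "not bad"]
model_outputs = ["positive", "negative", "positive", "positive"]
explanations = [
    "concept 'praise' -> positive",
    "concept 'boredom' -> negative",
    "concept 'praise' -> positive",
    "concept 'negation' -> negative",  # misleading explanation
]

baseline = simulatability(toy_simulator, inputs, model_outputs)
with_expl = simulatability(toy_simulator, inputs, model_outputs, explanations)
print(f"baseline={baseline:.2f} with_explanations={with_expl:.2f}")
```

In this toy setup the last explanation misleads the simulator, so the score with explanations stays below 1.0. This is the sense in which an end-to-end score reflects how well concepts are communicated as well as the concept space itself.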
