Learning by Self-Explaining

Wolfgang Stammer; Felix Friedrich; David Steinmann; Manuel Brack; Hikaru Shindo; Kristian Kersting

arXiv:2309.08395 · cs.AI · September 18, 2024 · 1 citation

TL;DR

This paper introduces Learning by Self-Explaining (LSX), an image-classification workflow in which a learner model is optimized both for its predictive task and on explanatory feedback from an internal critic model, leading to better generalization and more faithful explanations.

Contribution

The paper proposes LSX, which combines self-explanation with an internal critic to improve both model learning and explanation quality.

Findings

01. Improved model generalization and robustness.

02. Reduced influence of confounding factors.

03. Generated explanations are more task-relevant and faithful.

Abstract

Much of explainable AI research treats explanations as a means for model inspection. Yet, this neglects findings from human psychology that describe the benefit of self-explanations in an agent's learning process. Motivated by this, we introduce a novel workflow in the context of image classification, termed Learning by Self-Explaining (LSX). LSX utilizes aspects of self-refining AI and human-guided explanatory machine learning. The underlying idea is that a learner model, in addition to optimizing for the original predictive task, is further optimized based on explanatory feedback from an internal critic model. Intuitively, a learner's explanations are considered "useful" if the internal critic can perform the same task given these explanations. We provide an overview of important components of LSX and, based on this, perform extensive experimental evaluations via three different…
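
The learner/critic idea in the abstract can be illustrated with a toy loop. The following is a minimal sketch under strong assumptions, not the authors' implementation: the synthetic data, the linear learner, the input-times-weight explanation, the nearest-centroid critic, the weighting `lam`, and the hill-climbing update are all hypothetical stand-ins chosen so the example runs with the Python standard library alone.

```python
import random

def make_data(n, seed=0):
    """Toy binary task (assumption): feature 0 carries the label, feature 1 is noise."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        y = rng.randint(0, 1)
        signal = (1.0 if y == 1 else -1.0) + rng.gauss(0.0, 0.3)
        data.append(((signal, rng.gauss(0.0, 1.0)), y))
    return data

def task_accuracy(w, data):
    """Learner: a linear classifier with weights w, scored on the original task."""
    hits = sum((w[0] * x[0] + w[1] * x[1] > 0) == (y == 1) for x, y in data)
    return hits / len(data)

def explain(w, x):
    """Hypothetical explanation: per-feature input-times-weight attribution."""
    return (w[0] * x[0], w[1] * x[1])

def critic_accuracy(w, data):
    """Critic: nearest-centroid classifier that sees only the explanations.

    Its accuracy is the feedback signal: explanations count as "useful"
    if the critic can solve the same task from them.
    """
    expl = [(explain(w, x), y) for x, y in data]
    centroids = {}
    for label in (0, 1):
        pts = [e for e, y in expl if y == label]
        if not pts:
            return 0.0
        centroids[label] = (sum(p[0] for p in pts) / len(pts),
                            sum(p[1] for p in pts) / len(pts))

    def nearest(e):
        return min(centroids, key=lambda c: (e[0] - centroids[c][0]) ** 2
                                            + (e[1] - centroids[c][1]) ** 2)

    return sum(nearest(e) == y for e, y in expl) / len(expl)

def objective(w, data, lam=1.0):
    """Task performance plus critic feedback, weighted by lam (an assumption)."""
    return task_accuracy(w, data) + lam * critic_accuracy(w, data)

def train(data, steps=200, seed=1):
    """Hill climbing: keep a random perturbation only if the objective does not drop."""
    rng = random.Random(seed)
    w = (0.1, 1.0)  # start with most weight on the noise feature
    best = objective(w, data)
    history = [best]
    for _ in range(steps):
        cand = (w[0] + rng.gauss(0.0, 0.2), w[1] + rng.gauss(0.0, 0.2))
        score = objective(cand, data)
        if score >= best:
            w, best = cand, score
        history.append(best)
    return w, history
```

Calling `train(make_data(200))` returns the final weights and the objective after each step. In this sketch the critic term rewards weights whose explanations separate the classes, which is meant only to convey the intuition of learner optimization under explanatory feedback.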

Code & Models

Repositories

ml-research/learning-by-self-explaining (PyTorch, official)

Taxonomy

Topics: Explainable Artificial Intelligence (XAI) · Machine Learning in Healthcare · Topic Modeling

Methods: Balanced Selection