Transferring Visual Explainability of Self-Explaining Models to Prediction-Only Models without Additional Training

Yuya Yoshikawa; Ryotaro Shimizu; Takahiro Kawashima; Yuki Saito

arXiv:2507.04380 · cs.CV · February 3, 2026

TL;DR

This paper introduces a method to transfer visual explanation capabilities from self-explaining models to prediction-only models across domains, enhancing interpretability without retraining or significant accuracy loss.

Contribution

The paper presents a task arithmetic framework that transfers explanation capability to already-trained prediction-only models without additional training, extending interpretability to existing models.

Findings

01 Transfer of explanation is successful between related domains.

02 Explanation quality improves in target domain.

03 Classification accuracy remains largely unaffected.

Abstract

In image classification scenarios where both prediction and explanation efficiency are required, self-explaining models that perform both tasks in a single inference are effective. However, for users who already have prediction-only models, training a new self-explaining model from scratch imposes significant costs in terms of both labeling and computation. This study proposes a method to transfer the visual explanation capability of self-explaining models learned in a source domain to prediction-only models in a target domain based on a task arithmetic framework. Our self-explaining model comprises an architecture that extends Vision Transformer-based prediction-only models, enabling the proposed method to endow explanation capability to many trained prediction-only models without additional training. Experiments on various image classification datasets demonstrate that, except for…
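The abstract does not spell out the arithmetic, so the following is only a minimal sketch of generic task arithmetic under stated assumptions, not the authors' implementation. It assumes that a task vector is the element-wise difference between a source-domain self-explaining model and a source-domain prediction-only model, and that this vector is added, scaled by a coefficient, to a target-domain prediction-only model with matching parameter names. The function names, the parameter name `encoder.w`, the `scale` coefficient, and all weight values are hypothetical.

```python
from typing import Dict, List

# Toy stand-in for a model state dict: parameter name -> flat list of weights.
Weights = Dict[str, List[float]]


def task_vector(finetuned: Weights, base: Weights) -> Weights:
    """Element-wise difference (finetuned - base) for every parameter in base."""
    return {
        name: [f - b for f, b in zip(finetuned[name], base[name])]
        for name in base
    }


def apply_task_vector(target: Weights, tau: Weights, scale: float = 1.0) -> Weights:
    """Add a scaled task vector to target weights; leave other parameters as copies."""
    return {
        name: (
            [t + scale * d for t, d in zip(values, tau[name])]
            if name in tau
            else list(values)
        )
        for name, values in target.items()
    }


# Hypothetical weights (assumption: all three models share parameter names).
source_prediction_only = {"encoder.w": [1.0, 2.0]}
source_self_explaining = {"encoder.w": [1.5, 1.0]}
target_prediction_only = {"encoder.w": [0.0, 4.0]}

# Assumed "explanation" task vector learned in the source domain.
tau = task_vector(source_self_explaining, source_prediction_only)

# Transfer to the target-domain model without any gradient updates.
transferred = apply_task_vector(target_prediction_only, tau, scale=1.0)
half_strength = apply_task_vector(target_prediction_only, tau, scale=0.5)
```

No training step appears anywhere in the sketch: the transfer is pure weight arithmetic, which is what makes the "without additional training" property plausible. The scaling coefficient is a common knob in task arithmetic generally; whether and how the paper tunes it is not stated in the abstract.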

Taxonomy

Topics: Explainable Artificial Intelligence (XAI) · Data Visualization and Analytics