Backward Compatibility in Attributive Explanation and Enhanced Model Training Method

Ryuta Matsuno

arXiv:2408.02298·cs.LG·August 6, 2024

TL;DR

This paper introduces BCX, a metric for evaluating explanation consistency after model updates, and BCXR, a training method to improve explanation backward compatibility while maintaining predictive accuracy.

Contribution

The paper proposes BCX as a new quantitative metric for explanation backward compatibility and BCXR as a training method to enhance this compatibility in model updates.

Findings

1. BCXR improves explanation consistency across models.
2. BCXR maintains high predictive performance.
3. BCXR outperforms baseline methods in experiments.

Abstract

Model update is a crucial process in the operation of ML/AI systems. While updating a model generally improves average prediction performance, it can also significantly change the explanations of predictions. In real-world applications, even minor changes in explanations can have detrimental consequences. To tackle this issue, this paper introduces BCX, a quantitative metric that evaluates the backward compatibility of feature attribution explanations between pre- and post-update models. BCX uses practical agreement metrics to calculate the average agreement between the explanations of the pre- and post-update models, restricted to samples that both models predict correctly. In addition, we propose BCXR, a BCX-aware model training method built on surrogate losses that theoretically lower-bound agreement scores. Furthermore, we present a universal variant of BCXR that…
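The BCX definition in the abstract (average explanation agreement over samples that both models predict correctly) can be illustrated with a small sketch. This is not the paper's implementation: the abstract does not name its agreement metrics, so top-k feature overlap is assumed here, and the function names `top_k_features`, `top_k_agreement`, and `bcx_sketch`, the parameter `k`, and the toy data are all hypothetical.

```python
from statistics import mean

def top_k_features(attribution, k):
    """Indices of the k features with the largest absolute attribution."""
    ranked = sorted(range(len(attribution)),
                    key=lambda i: abs(attribution[i]), reverse=True)
    return set(ranked[:k])

def top_k_agreement(attr_old, attr_new, k):
    """Share of top-k features common to both explanations (assumed metric)."""
    return len(top_k_features(attr_old, k) & top_k_features(attr_new, k)) / k

def bcx_sketch(y_true, pred_old, pred_new, attrs_old, attrs_new, k=2):
    """Mean agreement over samples that BOTH models predict correctly."""
    scores = [
        top_k_agreement(a_old, a_new, k)
        for y, p_old, p_new, a_old, a_new
        in zip(y_true, pred_old, pred_new, attrs_old, attrs_new)
        if p_old == y and p_new == y
    ]
    # Returning None when no sample is jointly correct is a choice made here.
    return mean(scores) if scores else None

# Toy data (hypothetical): 4 samples, 3 features each.
y_true = [1, 0, 1, 1]
pred_old = [1, 0, 0, 1]   # pre-update model is wrong on sample 2
pred_new = [1, 0, 1, 0]   # post-update model is wrong on sample 3
attrs_old = [[0.9, -0.5, 0.1], [0.1, 0.7, -0.6], [0.2, 0.3, 0.4], [0.5, 0.1, 0.2]]
attrs_new = [[0.8, -0.6, 0.0], [0.5, 0.6, 0.1], [0.4, 0.3, 0.2], [0.1, 0.5, 0.2]]

score = bcx_sketch(y_true, pred_old, pred_new, attrs_old, attrs_new, k=2)
print(score)  # 0.75
```

Only samples 0 and 1 are predicted correctly by both models, so only their explanations enter the average: sample 0 shares both of its top-2 features (agreement 1.0) and sample 1 shares one (agreement 0.5). Excluding the other samples mirrors the abstract's restriction to jointly correct predictions.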

Taxonomy

Topics: Topic Modeling