Attri-Net: A Globally and Locally Inherently Interpretable Model for Multi-Label Classification Using Class-Specific Counterfactuals

Susu Sun; Stefano Woerner; Andreas Maier; Lisa M. Koch; Christian F. Baumgartner

arXiv:2406.05477 · cs.CV · November 14, 2025

TL;DR

Attri-Net is an inherently interpretable multi-label classification model that uses class-specific counterfactual attribution maps and linear classifiers to provide both local and global explanations, aligning with human knowledge without sacrificing accuracy.

Contribution

The paper introduces Attri-Net, a novel inherently interpretable model that generates class-specific attribution maps for global and local explanations in multi-label classification.

Findings

01. High-quality explanations aligned with clinical knowledge
02. Maintains classification performance comparable to non-interpretable models
03. Provides both local and global interpretability mechanisms

Abstract

Interpretability is crucial for machine learning algorithms in high-stakes medical applications. However, high-performing neural networks typically cannot explain their predictions. Post-hoc explanation methods provide a way to understand neural networks but have been shown to suffer from conceptual problems. Moreover, current research largely focuses on providing local explanations for individual samples rather than global explanations for the model itself. In this paper, we propose Attri-Net, an inherently interpretable model for multi-label classification that provides local and global explanations. Attri-Net first counterfactually generates class-specific attribution maps to highlight the disease evidence, then performs classification with logistic regression classifiers based solely on the attribution maps. Local explanations for each prediction can be obtained by interpreting the…
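The two-stage pipeline described in the abstract can be sketched in a few lines. This is a toy illustration, not the authors' implementation. It assumes that an attribution map is the pixelwise difference between a class-specific counterfactual image and the input, and that each class has its own logistic regression classifier operating only on that map. The names `attribution_map`, `logistic_predict`, `classify`, and the toy generators and weights are all hypothetical. In the real model the counterfactual generator is a trained network.

```python
import math
from typing import Callable, Dict, List, Tuple

Image = List[List[float]]


def attribution_map(image: Image, counterfactual: Image) -> Image:
    """Pixelwise difference: counterfactual minus input (assumed definition)."""
    return [
        [c - x for x, c in zip(row_x, row_c)]
        for row_x, row_c in zip(image, counterfactual)
    ]


def logistic_predict(attr: Image, weights: Image, bias: float) -> float:
    """Logistic regression whose only input is the attribution map."""
    logit = bias + sum(
        w * m
        for row_w, row_m in zip(weights, attr)
        for w, m in zip(row_w, row_m)
    )
    return 1.0 / (1.0 + math.exp(-logit))


def classify(
    image: Image,
    generators: Dict[str, Callable[[Image], Image]],
    classifiers: Dict[str, Tuple[Image, float]],
) -> Dict[str, dict]:
    """Multi-label prediction: one attribution map and one probability per class."""
    results = {}
    for label, generate in generators.items():
        attr = attribution_map(image, generate(image))
        weights, bias = classifiers[label]
        results[label] = {
            "attribution": attr,  # local explanation for this sample
            "probability": logistic_predict(attr, weights, bias),
        }
    return results


# Toy stand-ins for trained counterfactual generators (hypothetical).
def remove_top_right(image: Image) -> Image:
    out = [row[:] for row in image]
    out[0][1] = 0.0  # pretend the class evidence sits in the top-right pixel
    return out


def identity(image: Image) -> Image:
    return [row[:] for row in image]  # no evidence found: nothing changes


image = [[0.0, 1.0], [1.0, 0.0]]
generators = {"class_a": remove_top_right, "class_b": identity}
classifiers = {
    "class_a": ([[0.0, -2.0], [0.0, 0.0]], 0.0),
    "class_b": ([[0.0, 0.0], [0.0, 0.0]], 0.0),
}
results = classify(image, generators, classifiers)
```

In this sketch the per-sample attribution map serves as the local explanation, because the classifier sees nothing else. Since each classifier is linear, its weights can be inspected directly, which is one plausible route to a model-level view, though the abstract excerpt above is truncated before it describes the paper's actual global explanation mechanism.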

Code & Models

Repositories

ss-sun/Attri-Net-V2 (PyTorch, official)

Taxonomy

Topics: Text and Document Classification Technologies · Sentiment Analysis and Opinion Mining

Methods: ALIGN · Logistic Regression