Concept Gradient: Concept-based Interpretation Without Linear Assumption
Andrew Bai, Chih-Kuan Yeh, Pradeep Ravikumar, Neil Y. C. Lin, Cho-Jui Hsieh

TL;DR
This paper introduces Concept Gradient (CG), a concept-based interpretation method that drops the linear-separability assumption behind Concept Activation Vector (CAV) and uses gradients to measure how non-linear concepts influence a black-box model's predictions.
Contribution
The paper proposes Concept Gradient, a new approach that generalizes concept-based interpretation to non-linear concepts, improving interpretability over traditional linear methods like CAV.
Findings
CG outperforms CAV on both toy and real-world datasets
CG captures the influence of non-linear concepts on model predictions
CG extends gradient-based interpretation to the concept space
Abstract
Concept-based interpretations of black-box models are often more intuitive for humans to understand. The most widely adopted approach for concept-based interpretation is Concept Activation Vector (CAV). CAV relies on learning a linear relation between some latent representation of a given model and concepts. Linear separability is usually implicitly assumed but does not hold in general. In this work, we started from the original intent of concept-based interpretation and proposed Concept Gradient (CG), extending concept-based interpretation beyond linear concept functions. We showed that for a general (potentially non-linear) concept, we can mathematically evaluate how a small change in the concept affects the model's prediction, which leads to an extension of gradient-based interpretation to the concept space. We demonstrated empirically that CG outperforms CAV in both toy…
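To make the idea of "how a small change in the concept affects the model's prediction" concrete, here is a minimal sketch for a single scalar concept. It assumes a chain-rule formulation in which the gradient of the model output is projected onto the gradient of the concept function and normalized by the squared norm of the concept gradient (the pseudo-inverse in the one-concept case). That formula is an assumption of this sketch and is not stated in the abstract above. The functions `model` and `concept`, and the helper names, are hypothetical toy stand-ins; gradients are approximated by central differences rather than by backpropagation.

```python
def numerical_gradient(func, x, eps=1e-5):
    """Central-difference gradient of a scalar function at point x."""
    grad = []
    for i in range(len(x)):
        up, down = list(x), list(x)
        up[i] += eps
        down[i] -= eps
        grad.append((func(up) - func(down)) / (2 * eps))
    return grad


def concept_gradient(model, concept, x):
    """Sketch of a single-concept attribution (assumed form, see lead-in):
    (grad_f . grad_g) / ||grad_g||^2, i.e. the change in the model output
    per unit change in the concept along the concept's own gradient."""
    grad_f = numerical_gradient(model, x)
    grad_g = numerical_gradient(concept, x)
    norm_sq = sum(g * g for g in grad_g)
    if norm_sq == 0.0:
        raise ValueError("concept gradient is zero at this input")
    return sum(a * b for a, b in zip(grad_f, grad_g)) / norm_sq


# Hypothetical toy setup: a non-linear concept and a model that is
# exactly three times that concept.
def concept(x):
    return x[0] ** 2 + x[1] ** 2


def model(x):
    return 3.0 * concept(x)


cg_value = concept_gradient(model, concept, [1.0, 2.0])
```

In this toy setup the concept is not a linear function of the input, yet the score recovers the model's sensitivity to the concept (approximately 3) at any input where the concept gradient is non-zero. A single fixed direction in input space could not give that answer everywhere, because the concept's gradient direction changes from point to point.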
Taxonomy
Topics: Data Stream Mining Techniques · Explainable Artificial Intelligence (XAI) · Neural Networks and Applications
