E-TCAV: Formalizing Penultimate Proxies for Efficient Concept Based Interpretability
Hasib Aslam, Muhammad Ali Chattha, Muhammad Taha Mukhtar, Muhammad Imran Malik, Andreas Dengel, Sheraz Ahmed

TL;DR
E-TCAV is an efficient framework for concept-based interpretability in neural networks. It reduces computational cost by exploiting inter-layer agreement of TCAV scores and by using the penultimate layer as a proxy for earlier layers.
Contribution
The paper proposes E-TCAV, a method that efficiently approximates TCAV scores by analyzing inter-layer agreement and computing scores at the penultimate layer instead of at earlier layers.
Findings
Layers in the final block strongly agree with the penultimate layer in their TCAV scores
Variance in TCAV scores is influenced by the choice of latent classifier
E-TCAV achieves linear speed-ups relative to network size and sample count
Abstract
TCAV (Testing with Concept Activation Vectors) is an interpretability method that assesses the alignment between the internal representations of a trained neural network and human-understandable, high-level concepts. Though effective, TCAV suffers from significant computational overhead, inter-layer disagreement of TCAV scores, and statistical instability. This work takes a step toward addressing these challenges by introducing E-TCAV, a framework for efficient approximation of TCAV scores, which is based on extensive investigation into three key aspects of the TCAV methodology: 1) the effect of latent classifiers on the stability of TCAV scores, 2) the inter-layer agreement of TCAV scores, and 3) the use of the penultimate layer as a fast proxy for earlier layers for TCAV computation. To ensure a solid foundation for E-TCAV, we conduct extensive evaluations across four different…
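To make the quantity discussed in the abstract concrete, the following is a minimal sketch of how a TCAV score is conventionally computed at a single layer: a latent classifier separates concept activations from random activations, its direction serves as the concept activation vector (CAV), and the score is the fraction of inputs whose class-logit gradient has a positive component along that vector. This is not the E-TCAV procedure itself. The function names `mean_difference_cav` and `tcav_score` are hypothetical, the mean-difference classifier is a simple stand-in for whichever latent classifier is actually used, and the activations and gradients are toy values assumed to have been extracted from a model beforehand.

```python
from typing import List, Sequence

Vector = Sequence[float]


def mean_difference_cav(concept_acts: Sequence[Vector],
                        random_acts: Sequence[Vector]) -> List[float]:
    """Hypothetical latent classifier: unit-normalised difference of class means.

    This is an assumption made for illustration; other linear classifiers
    (e.g. logistic regression or a linear SVM) could be substituted.
    """
    dim = len(concept_acts[0])
    mean_c = [sum(a[i] for a in concept_acts) / len(concept_acts) for i in range(dim)]
    mean_r = [sum(a[i] for a in random_acts) / len(random_acts) for i in range(dim)]
    diff = [c - r for c, r in zip(mean_c, mean_r)]
    norm = sum(d * d for d in diff) ** 0.5
    if norm == 0.0:
        raise ValueError("concept and random activations have identical means")
    return [d / norm for d in diff]


def tcav_score(gradients: Sequence[Vector], cav: Vector) -> float:
    """Fraction of inputs whose directional derivative along the CAV is positive."""
    positives = sum(
        1 for g in gradients
        if sum(gi * vi for gi, vi in zip(g, cav)) > 0
    )
    return positives / len(gradients)


# Toy layer activations for concept examples and random examples (assumed values).
concept_acts = [[2.0, 0.0], [4.0, 0.0]]
random_acts = [[0.0, 0.0], [0.0, 0.0]]

# Toy gradients of a class logit w.r.t. the same layer, one row per input.
gradients = [[1.0, 5.0], [-2.0, 1.0], [0.5, -3.0], [3.0, 0.0]]

cav = mean_difference_cav(concept_acts, random_acts)
score = tcav_score(gradients, cav)
print(cav, score)
```

In this toy setup the CAV points along the first coordinate, and three of the four gradients have a positive first component, so the score is 0.75. Swapping in a different latent classifier changes only how `cav` is obtained, which is the point at which the classifier-dependent variance mentioned in the findings would enter.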
