$\alpha$-TCAV: A Unified Framework for Testing with Concept Activation Vectors
Ekkehard Schnoor, Jawher Said, Malik Tiomoko, Wojciech Samek, Alexander Jung

TL;DR
This paper introduces $\alpha$-TCAV, a unified, probabilistic framework for concept-based explainability in deep learning that addresses statistical instability in existing methods and offers principled tuning guidance.
Contribution
It develops $\alpha$-TCAV, replacing the indicator function with a smooth parameterized function, unifying TCAV variants, and providing theoretical insights and practical recommendations.
Findings
Identifies a flaw in the standard TCAV score related to variance.
Derives distributions for various CAV-based sensitivity scores.
Provides guidance on tuning $\alpha$-TCAV and resource allocation.
Abstract
Concept Activation Vectors (CAVs) are a fundamental tool for concept-based explainability in deep learning, yet their practical utility is limited by statistical instability. We analyze the stochastic nature of CAVs and the Testing with CAVs (TCAV) method, deriving the distributions of major CAV classes including PatternCAV, FastCAV, and ridge regression-based CAVs. We then identify a fundamental flaw in the standard TCAV score: its reliance on a discontinuous indicator function induces non-decaying variance in critical regimes. To address this, we introduce $\alpha$-TCAV, a generalized framework that replaces the indicator with a parameterized smooth function, yielding a unified probabilistic formulation that subsumes both TCAV and Multi-TCAV. We characterize the induced distributions of sensitivity scores and different TCAV variants, showing that established state-of-the-art choices…
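To make the indicator-versus-smooth contrast concrete, here is a minimal sketch. It assumes the standard TCAV score is the fraction of inputs whose directional sensitivity is positive. The smooth variant shown is only an illustration: this summary does not state the paper's actual parameterized function, so the logistic sigmoid, the function names, and the use of `alpha` as a slope parameter are all assumptions.

```python
import numpy as np


def tcav_score(sensitivities):
    """Indicator-based score: fraction of inputs with positive sensitivity."""
    s = np.asarray(sensitivities, dtype=float)
    return float(np.mean(s > 0))


def smooth_tcav_score(sensitivities, alpha):
    """Hypothetical smooth variant (an assumption, not the paper's definition).

    The indicator 1[s > 0] is replaced by a logistic sigmoid with slope alpha,
    written via tanh for numerical stability: sigmoid(x) = 0.5 * (1 + tanh(x / 2)).
    """
    s = np.asarray(sensitivities, dtype=float)
    return float(np.mean(0.5 * (1.0 + np.tanh(0.5 * alpha * s))))


# Toy sensitivities sitting very close to zero, where the indicator is fragile.
near_zero = np.array([1e-4, -1e-4, 2e-4, -2e-4])
shifted = near_zero + 3e-4  # a tiny perturbation, e.g. from re-estimating the CAV

print("indicator:", tcav_score(near_zero), "->", tcav_score(shifted))
print("smooth:   ", smooth_tcav_score(near_zero, alpha=10.0),
      "->", smooth_tcav_score(shifted, alpha=10.0))
```

In this toy setting the indicator-based score jumps when the sensitivities are nudged across zero, while the smooth score moves only slightly. With a very large slope the smooth score approaches the indicator-based one, which is one way a single parameter could interpolate between variants; whether the paper's parameterization behaves exactly this way is not stated in this summary.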
