Concept Embedding Models: Beyond the Accuracy-Explainability Trade-Off
Mateo Espinosa Zarlenga, Pietro Barbiero, Gabriele Ciravegna, Giuseppe Marra, Francesco Giannini, Michelangelo Diligenti, Zohreh Shams, Frederic Precioso, Stefano Melacci, Adrian Weller, Pietro Lio, Mateja Jamnik

TL;DR
This paper introduces Concept Embedding Models that improve trustworthiness and interpretability in AI systems by learning high-dimensional, meaningful concept representations, balancing accuracy, explanations, and interventions even with limited supervision.
Contribution
The paper proposes a new family of concept bottleneck models that learn interpretable high-dimensional concept embeddings, overcoming limitations of existing models in accuracy and robustness.
Findings
Achieves task accuracy comparable to or better than standard neural models trained without concepts
Captures meaningful concept semantics beyond the ground-truth concept labels
Supports effective test-time concept interventions that improve task accuracy
Abstract
Deploying AI-powered systems requires trustworthy models supporting effective human interactions, going beyond raw prediction accuracy. Concept bottleneck models promote trustworthiness by conditioning classification tasks on an intermediate level of human-like concepts. This enables human interventions which can correct mispredicted concepts to improve the model's performance. However, existing concept bottleneck models are unable to find optimal compromises between high task accuracy, robust concept-based explanations, and effective interventions on concepts -- particularly in real-world conditions where complete and accurate concept supervisions are scarce. To address this, we propose Concept Embedding Models, a novel family of concept bottleneck models which goes beyond the current accuracy-vs-interpretability trade-off by learning interpretable high-dimensional concept…
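The bottleneck-and-intervention idea described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: all function names, the hand-picked weights, and in particular the rule that blends an "active" and an "inactive" vector per concept by its predicted probability are assumptions made for illustration, since the text above only states that the model learns high-dimensional concept embeddings. The sketch shows a task prediction conditioned on concept estimates, and how overwriting one mispredicted concept at test time changes the downstream prediction.

```python
import math


def sigmoid(z):
    """Squash a real-valued score into a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_concepts(x, concept_weights):
    """Return one activation probability per concept (linear scorer + sigmoid)."""
    return [sigmoid(sum(w * xi for w, xi in zip(row, x))) for row in concept_weights]


def intervene(probs, corrections):
    """Overwrite selected concept probabilities with expert-supplied values."""
    fixed = list(probs)
    for index, value in corrections.items():
        fixed[index] = value
    return fixed


def build_bottleneck(probs, active_emb, inactive_emb):
    """Assumed mixing rule: blend each concept's two vectors by its probability."""
    bottleneck = []
    for p, on, off in zip(probs, active_emb, inactive_emb):
        bottleneck.extend(p * a + (1.0 - p) * b for a, b in zip(on, off))
    return bottleneck


def predict_task(bottleneck, task_weights):
    """Pick the class whose linear score over the bottleneck is largest."""
    scores = [sum(w * b for w, b in zip(row, bottleneck)) for row in task_weights]
    return max(range(len(scores)), key=scores.__getitem__)


# Toy, hand-picked parameters (hypothetical; nothing here is learned).
x = [1.0, -1.0]
concept_weights = [[2.0, 0.0], [0.0, 2.0]]
active_emb = [[1.0, 0.0], [0.0, 1.0]]
inactive_emb = [[-1.0, 0.0], [0.0, -1.0]]
task_weights = [[0.0, 0.0, 0.0, -1.0], [0.0, 0.0, 0.0, 1.0]]

probs = predict_concepts(x, concept_weights)
before = predict_task(build_bottleneck(probs, active_emb, inactive_emb), task_weights)

# An expert asserts that concept 1 is present, overriding the model's low estimate.
corrected = intervene(probs, {1: 1.0})
after = predict_task(build_bottleneck(corrected, active_emb, inactive_emb), task_weights)

print(before, after)
```

In this toy setup the task head reads only the bottleneck, so correcting a single concept is enough to flip the predicted class. A trained model would learn the scorers and embedding vectors jointly rather than use fixed values.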
Taxonomy
Topics: Data Stream Mining Techniques · Machine Learning in Healthcare · Explainable Artificial Intelligence (XAI)
