Now You See Me (CME): Concept-based Model Extraction
Dmitry Kazhdan, Botty Dimanov, Mateja Jamnik, Pietro Liò, Adrian Weller

TL;DR
CME is a framework for extracting and analyzing concept-based models from DNNs, enhancing explainability and improving predictive performance by identifying key concepts.
Contribution
This work introduces CME, a novel framework for concept-based model extraction and analysis of DNNs, with demonstrated improvements in model accuracy.
Findings
CME can analyze concept information learned by DNNs.
It reveals how DNNs utilize concepts for predictions.
In one of the two case studies, model accuracy improved by over 14% using only 30% of the available concepts.
Abstract
Deep Neural Networks (DNNs) have achieved remarkable performance on a range of tasks. A key step to further empowering DNN-based approaches is improving their explainability. In this work we present CME: a concept-based model extraction framework, used for analysing DNN models via concept-based extracted models. Using two case studies (dSprites and Caltech UCSD Birds), we demonstrate how CME can be used to (i) analyse the concept information learned by a DNN model, (ii) analyse how a DNN uses this concept information when predicting output labels, and (iii) identify key concept information that can further improve DNN predictive performance. For one of the case studies, we show how model accuracy can be improved by over 14% using only 30% of the available concepts.
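The abstract does not say how the extracted models are built, so the following is only a toy illustration of the general idea of a concept-based extracted model: one function maps a DNN's hidden activations to concept values, and a second maps concept values to the DNN's output label. The nearest-centroid probe, the lookup table, the function names, and the data are all assumptions made for this sketch, not the authors' method.

```python
from collections import Counter, defaultdict


def fit_concept_probe(activations, concept_values):
    """Hypothetical nearest-centroid probe: activation vector -> one concept value."""
    sums, counts = {}, Counter()
    for h, c in zip(activations, concept_values):
        acc = sums.setdefault(c, [0.0] * len(h))
        for i, x in enumerate(h):
            acc[i] += x
        counts[c] += 1
    centroids = {c: [s / counts[c] for s in acc] for c, acc in sums.items()}

    def predict(h):
        # Return the concept value whose centroid is closest to h.
        return min(
            centroids,
            key=lambda c: sum((a - b) ** 2 for a, b in zip(h, centroids[c])),
        )

    return predict


def fit_label_table(concept_vectors, labels):
    """Hypothetical concept-to-label function: majority vote per concept vector."""
    votes = defaultdict(Counter)
    for cv, y in zip(concept_vectors, labels):
        votes[tuple(cv)][y] += 1
    table = {cv: cnt.most_common(1)[0][0] for cv, cnt in votes.items()}
    default = Counter(labels).most_common(1)[0][0]
    return lambda cv: table.get(tuple(cv), default)


def extract_model(activations, concepts, dnn_labels):
    """Compose the two stages into an extracted model that mimics the DNN."""
    n_concepts = len(concepts[0])
    probes = [
        fit_concept_probe(activations, [c[k] for c in concepts])
        for k in range(n_concepts)
    ]

    def to_concepts(h):
        return [p(h) for p in probes]

    to_label = fit_label_table([to_concepts(h) for h in activations], dnn_labels)
    return to_concepts, lambda h: to_label(to_concepts(h))


# Toy data (made up): 2-D "hidden activations", two binary concepts,
# and DNN labels equal to the XOR of the two concepts.
activations = [[0.0, 0.1], [0.1, 0.0], [1.0, 0.9], [0.9, 1.0],
               [0.0, 1.0], [0.1, 0.9], [1.0, 0.0], [0.9, 0.1]]
concepts = [(0, 0), (0, 0), (1, 1), (1, 1), (0, 1), (0, 1), (1, 0), (1, 0)]
dnn_labels = [a ^ b for a, b in concepts]

to_concepts, predict = extract_model(activations, concepts, dnn_labels)
print(to_concepts([0.2, 0.8]), predict([0.2, 0.8]))
```

Because the extracted model routes every prediction through named concept values, one can inspect which concepts the probes recover and how the label depends on them, which is the kind of analysis the abstract describes in points (i) and (ii).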
Taxonomy
Topics: Topic Modeling · Time Series Analysis and Forecasting · Data Quality and Management
