The Impact of Concept Explanations and Interventions on Human-Machine Collaboration
Jack Furby, Dan Cunnington, Dave Braines, Alun Preece

TL;DR
This paper investigates how Concept Bottleneck Models (CBMs) affect human understanding of, and collaboration with, AI. It finds improved interpretability but limited impact on task accuracy in human-AI teams.
Contribution
First human studies on CBMs in collaborative settings, showing increased interpretability and alignment but no significant accuracy gains.
Findings
CBMs enhance interpretability over standard DNNs.
Increased interpretability does not significantly improve task accuracy.
Misalignment between human and model decisions can reduce effectiveness.
Abstract
Deep Neural Networks (DNNs) are often considered black boxes due to their opaque decision-making processes. To reduce their opacity, Concept Models (CMs), such as Concept Bottleneck Models (CBMs), were introduced to predict human-defined concepts as an intermediate step before predicting task labels. This enhances the interpretability of DNNs. In a human-machine setting, greater interpretability enables humans to improve their understanding of, and build trust in, a DNN. When CBMs were introduced, the models demonstrated increased task accuracy when incorrect concept predictions were replaced with their ground-truth values, a procedure known as intervening on the concept predictions. In a collaborative setting, if the model task accuracy improves from interventions, trust in a model and the human-machine task accuracy may increase. However, the result showing an increase in model task accuracy was…
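The two mechanisms the abstract describes, predicting concepts before the task label and intervening on concept predictions, can be illustrated with a toy sketch. Everything here is an assumption made for illustration and is not taken from the paper: the weights in `LABEL_WEIGHTS`, the concept values, and the helper names `predict_label` and `intervene` are all hypothetical, and the concept predictor is replaced by a hard-coded list. The example is constructed so that an intervention changes the label; it says nothing about how often that happens in practice.

```python
from typing import Dict, List, Sequence

# Hypothetical toy weights mapping 3 concepts to 2 task labels (not from the paper).
LABEL_WEIGHTS: List[List[float]] = [
    [2.0, -1.0, 0.5],   # weights used to score label 0
    [-1.0, 2.0, 0.5],   # weights used to score label 1
]


def predict_label(concepts: Sequence[float]) -> int:
    """Task head: return the label with the highest linear score over the concepts."""
    scores = [sum(w * c for w, c in zip(row, concepts)) for row in LABEL_WEIGHTS]
    return max(range(len(scores)), key=scores.__getitem__)


def intervene(predicted: Sequence[float], corrections: Dict[int, float]) -> List[float]:
    """Return a copy of the concept predictions with selected entries
    replaced by the supplied ground-truth values."""
    updated = list(predicted)
    for index, value in corrections.items():
        updated[index] = value
    return updated


# Stand-in for the output of a concept predictor on one input (assumed values).
predicted_concepts = [0.2, 0.7, 0.9]
# Assumed ground-truth concept values for the same input.
true_concepts = [1.0, 0.0, 1.0]

label_before = predict_label(predicted_concepts)

# A human corrects the first two concepts; the third is left as predicted.
corrected = intervene(predicted_concepts, {0: true_concepts[0], 1: true_concepts[1]})
label_after = predict_label(corrected)

print(label_before, label_after)
```

Because the task label depends only on the concept vector, correcting a concept is enough to change the downstream prediction without touching the model's weights. That property is what makes interventions possible in a bottleneck design.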
Taxonomy
Topics: Explainable Artificial Intelligence (XAI) · Ethics and Social Impacts of AI · Data Visualization and Analytics
