A Turing Test for Transparency
Felix Biessmann, Viktor Treu

TL;DR
This paper introduces a Turing Test for Transparency to evaluate whether explanations from AI systems can be distinguished from human explanations, aiming to improve trust calibration in human-AI interactions.
Contribution
The paper proposes a quantitative metric, based on a Turing Test for Transparency, that measures how well humans can tell machine-generated explanations apart from human explanations.
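As a rough illustration of what a Turing-Test-style metric could look like, the sketch below scores how often participants correctly identify the source of an explanation and checks whether that rate exceeds chance with an exact one-sided binomial test. This is not the authors' protocol: the function names, the "human"/"machine" labels, the 50% chance level, and the toy responses are all assumptions made for illustration.

```python
from math import comb


def detection_accuracy(true_sources, guessed_sources):
    """Fraction of trials in which the participant named the correct source."""
    if not true_sources or len(true_sources) != len(guessed_sources):
        raise ValueError("need two non-empty label lists of equal length")
    correct = sum(t == g for t, g in zip(true_sources, guessed_sources))
    return correct / len(true_sources)


def p_value_above_chance(n_correct, n_trials, chance=0.5):
    """Exact one-sided binomial test: P(X >= n_correct) under pure guessing."""
    return sum(
        comb(n_trials, k) * chance**k * (1 - chance) ** (n_trials - k)
        for k in range(n_correct, n_trials + 1)
    )


# Hypothetical toy data (not from the paper): the true source of each
# explanation and one participant's guess for each.
true_sources = ["human", "machine", "human", "machine",
                "human", "machine", "human", "machine"]
guesses = ["human", "human", "machine", "machine",
           "human", "machine", "machine", "human"]

accuracy = detection_accuracy(true_sources, guesses)
n_correct = round(accuracy * len(true_sources))
p_value = p_value_above_chance(n_correct, len(true_sources))
print(f"accuracy={accuracy:.2f}, p-value vs. chance={p_value:.3f}")
```

Under these assumptions, accuracy near the chance level with a large p-value would indicate that participants could not reliably tell machine-generated explanations from human ones.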
Findings
Most participants could not distinguish AI from human explanations.
AI explanations often appear human-like, which makes it harder for people to calibrate their trust.
The method provides a new way to evaluate transparency in XAI.
Abstract
A central goal of explainable artificial intelligence (XAI) is to improve the trust relationship in human-AI interaction. One assumption underlying research in transparent AI systems is that explanations help to better assess predictions of machine learning (ML) models, for instance by enabling humans to identify wrong predictions more efficiently. Recent empirical evidence, however, shows that explanations can have the opposite effect: when presented with explanations of ML predictions, humans often tend to trust ML predictions even when these are wrong. Experimental evidence suggests that this effect can be attributed to how intuitive, or human, an AI or explanation appears. This effect challenges the very goal of XAI and implies that responsible usage of transparent AI methods has to consider the ability of humans to distinguish machine-generated from human explanations. Here we propose a…
Taxonomy
Topics: Explainable Artificial Intelligence (XAI) · Adversarial Robustness in Machine Learning · Ethics and Social Impacts of AI
