To Err Is AI! Debugging as an Intervention to Facilitate Appropriate Reliance on AI Systems
Gaole He, Abri Bharos, and Ujwal Gadiraju

TL;DR
This study investigates whether having users debug an AI system improves their ability to rely appropriately on its advice. It finds that debugging may instead decrease reliance, possibly because it exposes the AI's weaknesses early.
Contribution
The paper introduces the idea of using debugging as an intervention to calibrate human trust in AI, and empirically tests its effectiveness in a human-AI collaboration context.
Findings
The debugging intervention did not increase appropriate reliance on AI.
Participants' reliance decreased after debugging, possibly due to early exposure to AI weaknesses.
Users' confidence and their estimates of the AI's trustworthiness vary with the AI's performance level.
Abstract
Powerful predictive AI systems have demonstrated great potential in augmenting human decision making. Recent empirical work has argued that the vision for optimal human-AI collaboration requires 'appropriate reliance' of humans on AI systems. However, accurately estimating the trustworthiness of AI advice at the instance level is quite challenging, especially in the absence of performance feedback pertaining to the AI system. In practice, the performance disparity of machine learning models on out-of-distribution data makes the dataset-specific performance feedback unreliable in human-AI collaboration. Inspired by existing literature on critical thinking and a critical mindset, we propose the use of debugging an AI system as an intervention to foster appropriate reliance. In this paper, we explore whether a critical evaluation of AI performance within a debugging setting can better…
