Why Trust in AI May Be Inevitable
Nghi Truong, Phanish Puranam, Ilia Tsetlin

TL;DR
This paper argues that trust in AI may be inevitable because explanation can be impossible under certain conditions, leading humans to trust AI systems without genuine understanding, especially as AI explanations become superficially convincing.
Contribution
It formalizes explanation as a search process in knowledge networks and demonstrates that explanation can fail even under ideal conditions, implying that trust may be unavoidable.
Findings
Explanation can fail even with rational, honest actors and perfect communication.
Time constraints can prevent finding explanation paths despite shared knowledge.
Humans may default to trust in AI when explanations are superficial or impossible.
Abstract
In human-AI interactions, explanation is widely seen as necessary for enabling trust in AI systems. We argue that trust, however, may be a pre-requisite because explanation is sometimes impossible. We derive this result from a formalization of explanation as a search process through knowledge networks, where explainers must find paths between shared concepts and the concept to be explained, within finite time. Our model reveals that explanation can fail even under theoretically ideal conditions - when actors are rational, honest, motivated, can communicate perfectly, and possess overlapping knowledge. This is because successful explanation requires not just the existence of shared knowledge but also finding the connection path within time constraints, and it can therefore be rational to cease attempts at explanation before the shared knowledge is discovered. This result has important…
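The search framing in the abstract can be illustrated with a small sketch. This is not the paper's model: the breadth-first strategy, the unit cost per concept examined, and all names (`find_explanation_path`, `knowledge`, `budget`) are assumptions made for illustration. It only shows how a connecting path can exist and still go unfound when the time budget is too small.

```python
from collections import deque


def find_explanation_path(graph, target, shared, budget):
    """Search outward from `target` for any concept in `shared`.

    Assumption: examining one concept costs one unit of `budget`.
    Returns the path from a shared concept to the target, or None
    if the budget is exhausted before a shared concept is reached.
    """
    parent = {target: None}
    queue = deque([target])
    steps = 0
    while queue and steps < budget:
        node = queue.popleft()
        steps += 1
        if node in shared:
            # Walk back through parents: shared concept first, target last.
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return path
        for neighbor in graph.get(node, ()):
            if neighbor not in parent:
                parent[neighbor] = node
                queue.append(neighbor)
    return None


# Hypothetical knowledge network: "T" is the concept to be explained,
# "S" is the only concept the explainee already knows.
knowledge = {
    "T": ["a", "x"],
    "a": ["b"],
    "x": ["y"],
    "b": ["S"],
    "y": [],
    "S": [],
}

print(find_explanation_path(knowledge, "T", {"S"}, budget=6))  # ['S', 'b', 'a', 'T']
print(find_explanation_path(knowledge, "T", {"S"}, budget=5))  # None
```

In this toy network the path from "S" to "T" exists in both calls; only the budget differs. With five steps the search stops one concept short, which mirrors the abstract's point that shared knowledge alone does not guarantee a successful explanation.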
