On the consistent reasoning paradox of intelligence and optimal trust in AI: The power of 'I don't know'

Alexander Bastounis; Paolo Campodonico; Mihaela van der Schaar; Ben Adcock; Anders C. Hansen

arXiv:2408.02357·cs.AI·August 6, 2024

Open Access

TL;DR

The paper introduces the Consistent Reasoning Paradox, showing that trustworthy AI must admit ignorance ('I don't know') to handle fallibility and maintain consistency, revealing fundamental limits of AI reasoning and explainability.

Contribution

It formalizes the Consistent Reasoning Paradox and argues that an AI must be able to answer 'I don't know' in order to be both trustworthy and consistent. It also introduces a new concept, the 'I don't know' function.

Findings

01

An AI that always answers and reasons consistently will hallucinate (give wrong yet plausible answers) on some problems, infinitely often.

02

Detecting hallucinations is harder than solving the problems.

03

Trustworthy AI must be able to say 'I don't know' to handle fallibility.
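
As a toy illustration of the third finding, the sketch below shows an answerer that treats equivalent phrasings of a task identically and returns 'I don't know' when it has no reliable way to answer. It is not the paper's construction: the names `normalize`, `answer`, and `IDK`, the lookup table, and the single arithmetic solver are all hypothetical, and the example does not capture the limits the paper proves.

```python
from typing import Callable, Dict, Optional

IDK = "I don't know"  # hypothetical abstention token


def normalize(sentence: str) -> Optional[str]:
    """Map a sentence to a canonical task label (toy lookup table, an assumption)."""
    table: Dict[str, str] = {
        "tell me the time!": "time",
        "what is the time?": "time",
        "what is 2 + 2?": "2+2",
        "compute 2 + 2.": "2+2",
    }
    return table.get(sentence.strip().lower())


def answer(sentence: str) -> str:
    """Answer consistently across equivalent phrasings, abstaining when unsure."""
    # Only tasks with a solver we trust get an answer; everything else abstains.
    solvers: Dict[str, Callable[[], str]] = {"2+2": lambda: str(2 + 2)}
    task = normalize(sentence)
    if task is None or task not in solvers:
        return IDK
    return solvers[task]()


print(answer("What is 2 + 2?"))
print(answer("Compute 2 + 2."))
print(answer("Tell me the time!"))
```

Both arithmetic phrasings receive the same answer, and both phrasings of the time question receive the same abstention, so the answerer stays consistent without guessing.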

Abstract

We introduce the Consistent Reasoning Paradox (CRP). Consistent reasoning, which lies at the core of human intelligence, is the ability to handle tasks that are equivalent, yet described by different sentences ('Tell me the time!' and 'What is the time?'). The CRP asserts that consistent reasoning implies fallibility -- in particular, human-like intelligence in AI necessarily comes with human-like fallibility. Specifically, it states that there are problems, e.g. in basic arithmetic, where any AI that always answers and strives to mimic human intelligence by reasoning consistently will hallucinate (produce wrong, yet plausible answers) infinitely often. The paradox is that there exists a non-consistently reasoning AI (which therefore cannot be on the level of human intelligence) that will be correct on the same set of problems. The CRP also shows that detecting these hallucinations,…

Taxonomy

Topics: Cognitive Science and Mapping

Methods: Sparse Evolutionary Training