Beyond Behavior: Why AI Evaluation Needs a Cognitive Revolution

Amir Konigsberg

arXiv:2604.05631·cs.AI·April 28, 2026

TL;DR

This paper advocates for a cognitive revolution in AI evaluation, emphasizing the need to move beyond behavioral tests to understand internal processes and mechanisms of intelligent systems.

Contribution

The paper highlights the limitations of purely behavioral evaluation in AI and proposes an epistemological shift toward understanding a system's internal processes.

Findings

01. Behavioral evaluation constrains questions about internal mechanisms.

02. A cognitive revolution in AI evaluation is necessary.

03. Current methods overlook differences in computational processes.

Abstract

In 1950, Alan Turing proposed replacing the question "Can machines think?" with a behavioral test: if a machine's outputs are indistinguishable from those of a thinking being, the question of whether it truly thinks can be set aside. This paper argues that Turing's move was not only a pragmatic simplification but also an epistemological commitment, a decision about what kind of evidence counts as relevant to intelligence attribution, and that this commitment has quietly constrained AI research for seven decades. We trace how Turing's behavioral epistemology became embedded in the field's evaluative infrastructure, rendering unaskable a class of questions about process, mechanism, and internal organization that cognitive psychology, neuroscience, and related disciplines learned to ask. We draw a structural parallel to the behaviorist-to-cognitivist transition in psychology: just as…
