The Turing Test Is More Relevant Than Ever
Avraham Rahimov, Orel Zamler, Amos Azaria

TL;DR
This paper defends the continued relevance of the Turing Test for evaluating AI, proposing refined versions of the test and demonstrating their effectiveness through systematic experiments.
Contribution
It introduces enhanced versions of the Turing Test, incorporating richer interactions and evaluation criteria, to better assess AI intelligence and distinguish it from human behavior.
Findings
Refined Turing Test versions improve differentiation between AI and humans.
Rich interaction environments increase detection accuracy.
Off-the-shelf LLMs struggle with more robust Turing Test setups.
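The notion of detection accuracy in the findings above can be made concrete with a small sketch. The Python below is not the authors' evaluation code. It assumes a paired setup in which each trial records whether the evaluator correctly picked the AI out of one AI and one human candidate, and it compares the observed accuracy against the 50% expected from guessing. The function names and the verdict data are hypothetical and serve only as illustration.

```python
from math import comb


def detection_accuracy(verdicts):
    """Fraction of trials in which the evaluator correctly identified the AI."""
    if not verdicts:
        raise ValueError("need at least one trial")
    return sum(verdicts) / len(verdicts)


def binomial_p_value(correct, trials, chance=0.5):
    """One-sided exact probability of at least `correct` hits under pure guessing."""
    return sum(
        comb(trials, k) * chance**k * (1 - chance) ** (trials - k)
        for k in range(correct, trials + 1)
    )


# Hypothetical data (not from the paper): True means the evaluator picked the AI correctly.
verdicts = [True] * 9 + [False] * 1

acc = detection_accuracy(verdicts)
p = binomial_p_value(sum(verdicts), len(verdicts))
print(f"accuracy={acc:.2f}, p={p:.4f}")
```

A small p-value here would suggest the evaluators distinguish the AI from the human better than chance, which is the sense in which a refined setup "improves differentiation". Real experiments would need many more trials and a design that accounts for repeated evaluators.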
Abstract
The Turing Test, first proposed by Alan Turing in 1950, has historically served as a benchmark for evaluating artificial intelligence (AI). However, ever since the release of ELIZA in 1966, and particularly with recent advances in large language models (LLMs), AI systems have been claimed to pass the Turing Test. Furthermore, critics argue that the Turing Test primarily assesses deceptive mimicry rather than genuine intelligence, which has prompted the continuous emergence of alternative benchmarks. This study argues against discarding the Turing Test and instead proposes more refined versions of it: for example, having the evaluator interact simultaneously with both an AI and a human candidate to determine who is who, allowing longer interactions, granting access to the Internet and other AIs, and using experienced people as evaluators. Through systematic experimentation using a web-based platform, we…
Taxonomy
Topics: Computability, Logic, AI Algorithms
