Can OpenAI o1 outperform humans in higher-order cognitive thinking?

Ehsan Latif; Yifan Zhou; Shuchen Guo; Lehong Shi; Yizhu Gao; Matthew Nyaaba; Arne Bewerdorff; Xiantong Yang; Xiaoming Zhai

arXiv:2412.05753 · cs.CY · December 10, 2024 · 3 cites

Open Access

TL;DR

This study assesses OpenAI's o1-preview model's performance in higher-order cognitive tasks, finding it often surpasses humans in structured thinking domains but has limitations in problem-solving and adaptive reasoning.

Contribution

It provides a comprehensive evaluation of o1-preview's capabilities across multiple cognitive domains, highlighting its strengths and limitations compared to human performance.

Findings

01 o1-preview outperforms humans on critical thinking, systematic thinking, and data literacy tasks.

02 It achieves near-perfect scientific reasoning scores, exceeding the highest human scores.

03 It shows limitations in problem-solving and adaptive reasoning tasks.

Abstract

This study evaluates the performance of OpenAI's o1-preview model in higher-order cognitive domains, including critical thinking, systematic thinking, computational thinking, data literacy, creative thinking, logical reasoning, and scientific reasoning. Using established benchmarks, we compared the o1-preview model's performance to that of human participants at diverse educational levels. o1-preview achieved a mean score of 24.33 on the Ennis-Weir Critical Thinking Essay Test (EWCTET), surpassing undergraduate (13.8) and postgraduate (18.39) participants (z = 1.60 and 0.90, respectively). In systematic thinking, it scored 46.1 (SD = 4.12) on the Lake Urmia Vignette, significantly outperforming the human mean (20.08, SD = 8.13, z = 3.20). For data literacy, o1-preview scored 8.60 (SD = 0.70) on Merk et al.'s "Use Data" dimension, compared to the human post-test mean of 4.17 (SD = 2.02) (z =…
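The z values quoted in the abstract appear to express how far the model's mean lies from the human mean, measured in human standard deviations. The sketch below assumes that reading, z = (model mean − human mean) / human SD, which reproduces the reported systematic-thinking value. The function name `z_score` and its parameter names are illustrative, not taken from the paper.

```python
def z_score(model_mean: float, human_mean: float, human_sd: float) -> float:
    """Distance of the model mean from the human mean, in human SD units.

    Assumption: this is the standardization behind the abstract's z values;
    the paper's exact procedure is not stated in the text shown here.
    """
    if human_sd <= 0:
        raise ValueError("human_sd must be positive")
    return (model_mean - human_mean) / human_sd


# Lake Urmia Vignette figures as quoted in the abstract.
z = z_score(model_mean=46.1, human_mean=20.08, human_sd=8.13)
print(f"{z:.2f}")  # prints 3.20
```

Under this reading, the model's systematic-thinking score sits a little over three human standard deviations above the human mean.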

Taxonomy

Topics: Explainable Artificial Intelligence (XAI) · Cognitive Science and Mapping · Computability, Logic, AI Algorithms