Does GPT Really Get It? A Hierarchical Scale to Quantify Human vs AI's Understanding of Algorithms
Mirabel Reid, Santosh S. Vempala

TL;DR
This paper introduces a hierarchy of levels of understanding for algorithms and uses it in a study with undergraduate and graduate students and successive generations of GPT models, comparing how humans and LLMs understand algorithms.
Contribution
It proposes a novel hierarchy of understanding levels for algorithms and applies it to assess and compare human and AI comprehension.
Findings
Humans and GPT models show both similarities and differences in understanding algorithms.
The hierarchy provides a structured way to measure AI's progress in understanding complex concepts.
The authors expect the hierarchy's rigorous criteria to help track AI progress in such cognitive domains.
Abstract
As Large Language Models (LLMs) perform (and sometimes excel at) more and more complex cognitive tasks, a natural question is whether AI really understands. The study of understanding in LLMs is in its infancy, and the community has yet to incorporate well-trodden research in philosophy, psychology, and education. We initiate this, specifically focusing on understanding algorithms, and propose a hierarchy of levels of understanding. We use the hierarchy to design and conduct a study with human subjects (undergraduate and graduate students) as well as large language models (generations of GPT), revealing interesting similarities and differences. We expect that our rigorous criteria will be useful to keep track of AI's progress in such cognitive domains.
Taxonomy
Topics: Ethics and Social Impacts of AI · Explainable Artificial Intelligence (XAI) · Artificial Intelligence in Healthcare and Education
