Have We Reached AGI? Comparing ChatGPT, Claude, and Gemini to Human Literacy and Education Benchmarks

Mfon Akpan

arXiv:2407.09573·cs.AI·July 16, 2024·3 cites

Open Access

TL;DR

This paper compares the performance of large language models like ChatGPT, Claude, and Gemini to human educational and literacy benchmarks, showing significant progress toward AGI but highlighting the need for broader assessments.

Contribution

It provides a comparative analysis of LLMs against human literacy and education benchmarks, revealing their strengths and limitations in approaching AGI.

Findings

01

LLMs outperform human benchmarks on undergraduate-level knowledge tasks

02

LLMs excel in advanced reading comprehension tasks

03

Establishing true AGI requires broader cognitive assessments

Abstract

Recent advancements in AI, particularly in large language models (LLMs) like ChatGPT, Claude, and Gemini, have prompted questions about their proximity to Artificial General Intelligence (AGI). This study compares LLM performance on educational benchmarks with Americans' average educational attainment and literacy levels, using data from the U.S. Census Bureau and technical reports. Results show that LLMs significantly outperform human benchmarks in tasks such as undergraduate knowledge and advanced reading comprehension, indicating substantial progress toward AGI. However, true AGI requires broader cognitive assessments. The study highlights the implications for AI development, education, and societal impact, emphasizing the need for ongoing research and ethical considerations.

Taxonomy

Topics: Artificial Intelligence in Healthcare and Education · Topic Modeling · Machine Learning in Healthcare