ChatGPT and Gemini participated in the Korean College Scholastic Ability Test -- Earth Science I

Seok-Hyun Ga; Chun-Yen Chang

arXiv:2512.15298 · cs.AI · December 18, 2025

Open Access

TL;DR

This study evaluates the scientific reasoning capabilities and limitations of advanced LLMs like GPT-4o and Gemini in the context of the Korean College Scholastic Ability Test, revealing key perception and reasoning flaws to inform AI-resistant assessment design.

Contribution

The paper provides a detailed analysis of LLMs' performance on a real-world science test, identifying specific cognitive weaknesses and proposing strategies for AI-resistant assessments.

Findings

01

Unstructured inputs significantly degrade model performance because of segmentation and OCR errors.

02

Perception errors dominate, highlighting a perception-cognition gap.

03

Models perform calculations well but fail to grasp underlying concepts.

Abstract

The rapid development of Generative AI is bringing innovative changes to education and assessment. As the prevalence of students utilizing AI for assignments increases, concerns regarding academic integrity and the validity of assessments are growing. This study utilizes the Earth Science I section of the 2025 Korean College Scholastic Ability Test (CSAT) to deeply analyze the multimodal scientific reasoning capabilities and cognitive limitations of state-of-the-art Large Language Models (LLMs), including GPT-4o, Gemini 2.5 Flash, and Gemini 2.5 Pro. Three experimental conditions (full-page input, individual item input, and optimized multimodal input) were designed to evaluate model performance across different data structures. Quantitative results indicated that unstructured inputs led to significant performance degradation due to segmentation and Optical Character Recognition (OCR)…

Taxonomy

Topics: Artificial Intelligence in Healthcare and Education · Intelligent Tutoring Systems and Adaptive Learning · Explainable Artificial Intelligence (XAI)