More Robots are Coming: Large Multimodal Models (ChatGPT) can Solve Visually Diverse Images of Parsons Problems

Irene Hou; Owen Man; Sophie Mettille; Sebastian Gutierrez; Kenneth Angelikas; Stephen MacNeil

arXiv:2311.04926 · cs.CL · November 10, 2023 · 1 cites

TL;DR

Large multimodal models like GPT-4V can effectively solve visually diverse Parsons problems, raising questions about their impact on academic integrity and assessment strategies in computing education.

Contribution

This study evaluates the performance of GPT-4V and Bard on visual Parsons problems, highlighting GPT-4V's high success rate and implications for education.

Findings

1. GPT-4V solved 96.7% of visual Parsons problems.
2. Bard solved 69.2% of problems and faced hallucination issues.
3. Visual problems alone may not prevent academic integrity violations.

Abstract

The advent of large language models is reshaping computing education. Recent research has demonstrated that these models can produce better explanations than students, answer multiple-choice questions at or above the class average, and generate code that can pass automated tests in introductory courses. These capabilities have prompted instructors to rapidly adapt their courses and assessment methods to accommodate changes in learning objectives and the potential for academic integrity violations. While some scholars have advocated for the integration of visual problems as a safeguard against the capabilities of language models, new multimodal language models now have vision and language capabilities that may allow them to analyze and solve visual problems. In this paper, we evaluate the performance of two large multimodal models on visual assignments, with a specific focus on Parsons…

Taxonomy

Topics: Artificial Intelligence in Healthcare and Education · Topic Modeling · Text Readability and Simplification