An Approach to Solving the Abstraction and Reasoning Corpus (ARC) Challenge
Tan John Chong Min

TL;DR
This paper explores using GPT-4 with prompt engineering to address the ARC challenge, demonstrating partial success and proposing future multi-agent, memory, and image interpretation enhancements.
Contribution
It introduces a novel approach leveraging large language models with prompt engineering to tackle the ARC challenge, highlighting potential scalability and multi-modal integration.
Findings
GPT-4 solves 2 out of 4 small ARC challenges
Prompt tweaks improve problem-solving capability
Scaling with multi-agent systems and image tools may solve most ARC tasks
Abstract
We utilise Large Language Models (LLMs), in particular GPT-4, which can be prompt engineered into performing an arbitrary task. Here, we give the model some human priors via text, along with some typical procedures for solving the ARC tasks, and ask it to i) generate a broad description of the input-output relation, ii) generate the detailed steps of the input-output mapping, and iii) use the detailed steps to manipulate the test input and derive the test output. The current GPT-3.5/GPT-4 prompt solves 2 out of 4 tested small ARC challenges (those with grids of 8x8 and below). With tweaks that make the prompt more specific to the use case, it can solve more. We posit that, when scaled to a multi-agent system that uses past memory and is equipped with an image interpretation tool via Visual Question Answering, this approach may be able to solve the majority of the ARC challenge.
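The three-part prompt described in the abstract could be assembled roughly as follows. This is a minimal sketch, not the author's actual prompt: the function names (grid_to_text, is_small, build_prompt), the prompt wording, and the toy task are all hypothetical. The only thing taken as given is the public ARC JSON layout, in which a task has "train" and "test" lists of input/output grid pairs. No model is called here; the sketch only builds the text that would be sent.

```python
def grid_to_text(grid):
    """Serialise a 2D list of ints as rows of space-separated digits."""
    return "\n".join(" ".join(str(cell) for cell in row) for row in grid)


def is_small(task, max_side=8):
    """True if every grid in the task is at most max_side x max_side."""
    grids = [g for pair in task["train"] for g in (pair["input"], pair["output"])]
    grids += [pair["input"] for pair in task["test"]]
    return all(
        len(g) <= max_side and all(len(row) <= max_side for row in g)
        for g in grids
    )


def build_prompt(task, priors):
    """Combine text priors, training pairs, and the test input into one prompt.

    The wording of the three requested outputs is illustrative only.
    """
    parts = [priors]
    for i, pair in enumerate(task["train"], start=1):
        parts.append(f"Example {i} input:\n{grid_to_text(pair['input'])}")
        parts.append(f"Example {i} output:\n{grid_to_text(pair['output'])}")
    parts.append(f"Test input:\n{grid_to_text(task['test'][0]['input'])}")
    parts.append(
        "Respond with:\n"
        "1. A broad description of the input-output relation.\n"
        "2. Detailed steps of the input-output mapping.\n"
        "3. The test output grid, obtained by applying those steps to the test input."
    )
    return "\n\n".join(parts)


# Hypothetical toy task in the ARC JSON layout (not a real ARC task).
TOY_TASK = {
    "train": [{"input": [[1, 0], [0, 0]], "output": [[0, 1], [0, 0]]}],
    "test": [{"input": [[0, 0], [1, 0]]}],
}
```

Calling build_prompt(TOY_TASK, "Grids contain objects; look for movement, symmetry, and counting.") returns a single string that can be passed to any chat model. The is_small check mirrors the restriction to grids of 8x8 and below mentioned in the abstract.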
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Multimodal Machine Learning Applications
Methods: Test
