Spreadsheet Modeling Experiments Using GPTs on Small Problem Statements and the Wall Task

Thomas A. Grossman, Yuan Chen, and Sopiko Datuashvili

arXiv:2604.25689 · cs.SE · April 29, 2026

TL;DR

This study evaluates GPT-based tools for creating spreadsheet models, finding that while promising, current tools are unreliable and require skilled user oversight.

Contribution

It provides a systematic evaluation of GPT extensions for spreadsheet modeling, highlighting key challenges and suggesting directions for future research.

Findings

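01. Excel AI can produce well-structured models, but its output is inconsistent and often non-reproducible.

02. GPT tools raise a "problem of confidence" and a "problem of workflow": skilled users must verify and adapt the generated spreadsheets.

03. Current GPT-based spreadsheet tools remain unreliable for professional use.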

Abstract

This paper investigates how GPT-based tools can assist in building reusable analytical spreadsheet models. After a screening, we evaluate five GPT extensions and select Excel AI by pulsrai.com for detailed testing. Through structured experiments on simple problem statements, we assess Excel AI's performance against the ERFR criteria (each input in a cell; cell formulas; no hardwired numbers; labels; accurate). Results show that while Excel AI can produce well-structured models, it is inconsistent and often non-reproducible. We identify two central challenges - "the problem of confidence" and "the problem of workflow" - which highlight the need for skilled users to verify and adapt GPT-generated spreadsheets. Though GPTs show promise for generating draft models that may reduce development time or lower skill requirements, current tools remain unreliable for professional use. We conclude…
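
To make one of the listed criteria concrete, the following is a minimal sketch of how the "no hardwired numbers" criterion could be checked mechanically. It is an illustration only, not the authors' evaluation method; the toy `sheet` dictionary, the `hardwired_numbers` function, and the regular expression are all hypothetical and written for this example.

```python
import re

# Hypothetical toy sheet: cell address -> raw content.
# Strings that start with "=" are treated as formulas; everything else
# is a label or an input value.
sheet = {
    "A1": "Unit price", "B1": 12.5,
    "A2": "Quantity",   "B2": 40,
    "A3": "Revenue",    "B3": "=B1*B2",
    "A4": "Tax",        "B4": "=B3*0.08",  # 0.08 is hardwired into the formula
}

# A numeric literal that is not the row part of a cell reference such as
# B3 or $B$3. Simplification: function names containing digits (LOG10)
# and quoted strings are not handled.
NUMBER = re.compile(r"(?<![A-Za-z$\d.])\d+(?:\.\d+)?")


def hardwired_numbers(sheet):
    """Return {cell: [literals]} for formulas containing numeric constants."""
    flagged = {}
    for cell, content in sheet.items():
        if isinstance(content, str) and content.startswith("="):
            literals = NUMBER.findall(content)
            if literals:
                flagged[cell] = literals
    return flagged


print(hardwired_numbers(sheet))
```

In this toy sheet only `B4` is flagged, because the tax rate sits inside the formula instead of in its own labeled input cell. A check like this covers just one criterion; accuracy in particular still needs review by a skilled user, which is the "problem of confidence" the abstract describes.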
