Can AI Tools Transform Low-Demand Math Tasks? An Evaluation of Task Modification Capabilities

Danielle S. Fox; Brenda L. Robles; Elizabeth DiPietro Brovey; Christian D. Schunn

arXiv:2604.12743·cs.AI·April 15, 2026

TL;DR

This study evaluates how well AI tools can upgrade low-cognitive-demand math tasks. The tools succeeded only moderately often (64% of the time on average), which suggests they cannot yet reliably raise tasks to higher cognitive levels.

Contribution

The paper provides an empirical assessment of how effectively AI tools modify mathematics tasks, comparing general-purpose tools with tools specialized for mathematics teachers.

Findings

01. AI tools upgraded tasks accurately 64% of the time

02. Performance varied from 33% to 88% among tools

03. Specialized tools only slightly outperformed general-purpose tools

Abstract

While recent research has explored AI tools' ability to classify the quality of mathematical tasks (arXiv:2603.03512), little is known about their capacity to increase the quality of existing tasks. This study investigated whether AI tools could successfully upgrade low-cognitive-demand mathematics tasks. Eleven tools were tested, including six broadly available, general-purpose AI tools (e.g., ChatGPT and Claude) and five tools specialized for mathematics teachers (e.g., Khanmigo, coteach.ai). Using the Task Analysis Guide framework (Stein & Smith, 1998), we prompted AI tools to modify two different types of low-demand mathematical tasks. The prompting strategy aimed to represent likely approaches taken by knowledgeable teachers, rather than extensive optimization to find a more effective prompt (i.e., an optimistic typical outcome). On average, AI tools were only moderately…
