Assessing the Creativity of LLMs in Proposing Novel Solutions to Mathematical Problems

Junyi Ye; Jingyi Gu; Xinyun Zhao; Wenpeng Yin; Guiling Wang

arXiv:2410.18336 · cs.CL · October 25, 2024

TL;DR

This paper investigates the creative problem-solving abilities of Large Language Models in mathematics, introducing a new benchmark to evaluate their capacity for generating innovative solutions beyond correctness.

Contribution

It presents a novel framework and benchmark, CreativeMath, to assess LLMs' ability to propose creative solutions to diverse mathematical problems.

Findings

01. Gemini-1.5-Pro outperforms other LLMs in creative problem-solving.

02. LLMs show variable capacity for generating novel solutions.

03. The study highlights both strengths and limitations of LLMs in mathematical creativity.

Abstract

The mathematical capabilities of AI systems are complex and multifaceted. Most existing research has predominantly focused on the correctness of AI-generated solutions to mathematical problems. In this work, we argue that beyond producing correct answers, AI systems should also be capable of, or assist humans in, developing novel solutions to mathematical challenges. This study explores the creative potential of Large Language Models (LLMs) in mathematical reasoning, an aspect that has received limited attention in prior research. We introduce a novel framework and benchmark, CreativeMath, which encompasses problems ranging from middle school curricula to Olympic-level competitions, designed to assess LLMs' ability to propose innovative solutions after some known solutions have been provided. Our experiments demonstrate that, while LLMs perform well on standard mathematical tasks, their…

Code & Models

Repositories

junyiye/creativemath · Official

Videos

Assessing the Creativity of LLMs in Proposing Novel Solutions to Mathematical Problems · Underline

Taxonomy

Topics: Mathematics Education and Pedagogy

Methods: Softmax · Attention Is All You Need