A Matter of Interest: Understanding Interestingness of Math Problems in Humans and Language Models

Shubhra Mishra; Yuka Machino; Gabriel Poesia; Albert Jiang; Joy Hsu; Adrian Weller; Challenger Mishra; David Broman; Joshua B. Tenenbaum; Mateja Jamnik; Cedegao E. Zhang; Katherine M. Collins

arXiv:2511.08548 · cs.AI · November 12, 2025

Open Access

TL;DR

This paper investigates how well language models' judgments of the interestingness and difficulty of math problems align with human judgments, and finds both broad agreement and notable differences.

Contribution

The paper provides an empirical analysis of how well LLMs align with humans in judging the interestingness and difficulty of math problems, and highlights where current models fall short.

Findings

1. LLMs broadly agree with human notions of interestingness.

2. LLMs do not fully capture the distribution of human judgments.

3. LLM rationales for interestingness correlate only weakly with human rationales.

Abstract

The evolution of mathematics has been guided in part by interestingness. From researchers choosing which problems to tackle next, to students deciding which ones to engage with, people's choices are often guided by judgments about how interesting or challenging problems are likely to be. As AI systems, such as LLMs, increasingly participate in mathematics with people -- whether for advanced research or education -- it becomes important to understand how well their judgments align with human ones. Our work examines this alignment through two empirical studies of human and LLM assessment of mathematical interestingness and difficulty, spanning a range of mathematical experience. We study two groups: participants from a crowdsourcing platform and International Math Olympiad competitors. We show that while many LLMs appear to broadly agree with human notions of interestingness, they mostly…

Taxonomy

Topics: Intelligent Tutoring Systems and Adaptive Learning · Psychological and Educational Research Studies · Ferroelectric and Negative Capacitance Devices