ConvexBench: Can LLMs Recognize Convex Functions?
Yepeng Liu, Yu Huang, Yu-Xiang Wang, Yingbin Liang, Yuheng Bu

TL;DR
ConvexBench is a new benchmark designed to evaluate whether Large Language Models can recognize convex functions, revealing significant reasoning limitations at high compositional depths and proposing an agentic divide-and-conquer approach to improve performance.
Contribution
The paper introduces ConvexBench, a scalable benchmark for testing LLMs' ability to identify convexity in deep compositions, and proposes an agentic framework to address reasoning failures.
Findings
LLM performance on convexity recognition drops sharply as composition depth increases.
The proposed divide-and-conquer framework significantly improves reasoning accuracy at large depths.
Models exhibit parsing failure and lazy reasoning as primary failure modes.
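To make the divide-and-conquer idea concrete, here is a minimal sketch of how convexity of a deep composition can be decided one layer at a time. It is not the paper's framework: it applies the standard scalar composition rules from convex analysis (the kind used in disciplined convex programming) to an already-parsed expression tree. The `ATOMS` table, the `Var` and `Apply` classes, and the `curvature` function are hypothetical names introduced for illustration, and domain restrictions (for example, `log` needing a positive argument) are ignored.

```python
from dataclasses import dataclass
from typing import Union

# Curvature labels; UNKNOWN means the rules below cannot certify anything.
AFFINE, CONVEX, CONCAVE, UNKNOWN = "affine", "convex", "concave", "unknown"

# Hypothetical atom table: name -> (curvature, monotonicity).
ATOMS = {
    "exp": (CONVEX, "increasing"),
    "neg_log": (CONVEX, "decreasing"),  # x -> -log(x)
    "log": (CONCAVE, "increasing"),
    "square": (CONVEX, "none"),         # not monotone on all of R
    "neg": (AFFINE, "decreasing"),      # x -> -x
}


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class Apply:
    atom: str
    arg: Union["Var", "Apply"]


def curvature(expr: Union[Var, Apply]) -> str:
    """Certify the curvature of h(g(x)) from the curvature of g and the atom h."""
    if isinstance(expr, Var):
        return AFFINE
    inner = curvature(expr.arg)          # solve the sub-problem first
    h_curv, h_mono = ATOMS[expr.atom]
    if inner == UNKNOWN:
        return UNKNOWN
    if inner == AFFINE:
        return h_curv                    # affine inner function preserves h's curvature
    if h_curv == AFFINE:
        # An affine atom keeps curvature if increasing and flips it if decreasing.
        if h_mono == "increasing":
            return inner
        return CONCAVE if inner == CONVEX else CONVEX
    if h_curv == CONVEX and (
        (h_mono == "increasing" and inner == CONVEX)
        or (h_mono == "decreasing" and inner == CONCAVE)
    ):
        return CONVEX
    if h_curv == CONCAVE and (
        (h_mono == "increasing" and inner == CONCAVE)
        or (h_mono == "decreasing" and inner == CONVEX)
    ):
        return CONCAVE
    return UNKNOWN                       # rules are sufficient, not necessary


x = Var("x")
print(curvature(Apply("exp", Apply("square", x))))   # exp(x^2)
print(curvature(Apply("neg_log", Apply("log", x))))  # -log(log(x))
print(curvature(Apply("square", Apply("exp", x))))   # (exp(x))^2
```

The last example returns "unknown" even though (exp(x))^2 = exp(2x) is convex, which shows that these rules can only certify curvature, never refute it. Each recursive call looks at a single atom and a single already-resolved sub-result, so the amount of reasoning per step stays constant regardless of depth.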
Abstract
Convex analysis is a modern branch of mathematics with many applications. As Large Language Models (LLMs) start to automate research-level mathematics and science, it is important for LLMs to demonstrate the ability to understand and reason with convexity. We introduce ConvexBench, a scalable and mechanically verifiable benchmark for testing whether LLMs can identify the convexity of a symbolic objective under deep functional composition. Experiments on frontier LLMs reveal a sharp compositional reasoning gap: performance degrades rapidly with increasing depth, with the F1-score dropping steeply between shallow and deep compositions. Inspection of models' reasoning traces indicates two failure modes: parsing failure and lazy reasoning. To address these limitations, we propose an agentic divide-and-conquer framework that (i) offloads parsing to an external tool…
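The abstract calls the benchmark mechanically verifiable but this excerpt does not say how labels are checked. As one illustration of what a mechanical check can look like, the sketch below searches numerically for a violation of the defining inequality f(t x + (1 - t) y) <= t f(x) + (1 - t) f(y). This is an assumption for illustration, not the authors' verification procedure; `find_convexity_violation` and its parameters are hypothetical names. A returned triple disproves convexity on the interval, while `None` only means no counterexample was found.

```python
import math
import random
from typing import Callable, Optional, Tuple


def find_convexity_violation(
    f: Callable[[float], float],
    lo: float,
    hi: float,
    trials: int = 10_000,
    tol: float = 1e-9,
    seed: int = 0,
) -> Optional[Tuple[float, float, float]]:
    """Randomly search [lo, hi] for (x, y, t) that violates the convexity inequality."""
    rng = random.Random(seed)
    for _ in range(trials):
        x, y = rng.uniform(lo, hi), rng.uniform(lo, hi)
        t = rng.random()
        z = t * x + (1 - t) * y
        chord = t * f(x) + (1 - t) * f(y)
        # Scale the tolerance so floating-point rounding is not reported as a violation.
        if f(z) > chord + tol * (1 + abs(chord)):
            return (x, y, t)
    return None


# exp(x^2) is convex, so no violation should be found on [-2, 2].
print(find_convexity_violation(lambda v: math.exp(v * v), -2.0, 2.0))

# sin is concave on [0, pi], so a violation is found quickly.
print(find_convexity_violation(math.sin, 0.0, math.pi) is not None)
```

A check like this complements rule-based certification: the rules can prove convexity but not refute it, while sampling can refute it but not prove it.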
Taxonomy
Topics: Machine Learning in Materials Science · Natural Language Processing Techniques · Topic Modeling
