MM-Agent: LLM as Agents for Real-world Mathematical Modeling Problem
Fan Liu, Zherui Yang, Cancheng Liu, Tianrui Song, Xiaofeng Gao, Hao Liu

TL;DR
This paper introduces MM-Agent, a framework leveraging LLMs to automate real-world mathematical modeling, outperforming baselines and aiding teams in competitions, with a new benchmark dataset for evaluation.
Contribution
The paper formalizes the task of LLM-powered mathematical modeling, introduces the MM-Agent framework, and provides the MM-Bench benchmark for evaluating such systems.
Findings
MM-Agent outperforms baseline agents by 11.88% on MM-Bench.
MM-Agent achieves near-human performance and wins awards in MCM/ICM 2025.
The approach is cost-effective, requiring only 15 minutes and $0.88 per task.
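To put the per-task figures in context, the short sketch below scales them to the 111 problems that the abstract gives as the size of MM-Bench. It assumes every problem costs exactly the reported 15 minutes and $0.88, which this summary does not state; the function name `projected_totals` and the constants are illustrative and do not come from the paper.

```python
# Figures quoted in the summary; treating them as uniform per-task
# costs is an assumption made only for this back-of-the-envelope estimate.
MINUTES_PER_TASK = 15
USD_PER_TASK = 0.88
NUM_PROBLEMS = 111  # size of MM-Bench, as given in the abstract


def projected_totals(num_tasks, minutes_per_task=MINUTES_PER_TASK,
                     usd_per_task=USD_PER_TASK):
    """Return (total_hours, total_usd) for num_tasks at a flat per-task cost."""
    total_hours = num_tasks * minutes_per_task / 60
    total_usd = round(num_tasks * usd_per_task, 2)
    return total_hours, total_usd


hours, usd = projected_totals(NUM_PROBLEMS)
print(f"{NUM_PROBLEMS} tasks: about {hours:.2f} hours and ${usd:.2f}")
```

Under that flat-cost assumption, a full pass over the benchmark would take roughly 28 hours of sequential run time and cost a little under $100.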
Abstract
Mathematical modeling is a cornerstone of scientific discovery and engineering practice, enabling the translation of real-world problems into formal systems across domains such as physics, biology, and economics. Unlike mathematical reasoning, which assumes a predefined formulation, modeling requires open-ended problem analysis, abstraction, and principled formalization. While Large Language Models (LLMs) have shown strong reasoning capabilities, they fall short in rigorous model construction, limiting their utility in real-world problem-solving. To this end, we formalize the task of LLM-powered real-world mathematical modeling, where agents must analyze problems, construct domain-appropriate formulations, and generate complete end-to-end solutions. We introduce MM-Bench, a curated benchmark of 111 problems from the Mathematical Contest in Modeling (MCM/ICM), spanning the years 2000 to…
Taxonomy
Topics: Model-Driven Software Engineering Techniques · Distributed and Parallel Computing Systems · Scheduling and Optimization Algorithms
