Automated Optimization Modeling via a Localizable Error-Driven Perspective
Weiting Liu, Han Wu, Yufei Kuang, Xiongwei Han, Tao Zhong, Jianfeng Feng, Wenlian Lu

TL;DR
This paper introduces MIND, a novel error-driven framework that enhances automated optimization modeling with LLMs by focusing on localized error patterns, leading to improved performance across multiple benchmarks.
Contribution
The paper proposes MIND, a localizable error-driven learning framework that addresses data sparsity issues in automated optimization modeling with LLMs, improving training and post-training effectiveness.
Findings
MIND outperforms existing approaches on six benchmarks.
Localized error patterns enable focused training and refinement.
Dynamic Supervised Fine-Tuning Policy Optimization (DFPO) improves performance on difficult problems.
Abstract
Automated optimization modeling via Large Language Models (LLMs) has emerged as a promising approach to assist complex human decision-making. While post-training has become a pivotal technique to enhance LLMs' capabilities in this domain, its effectiveness is severely constrained by the scarcity and underutilization of high-quality training data. Through detailed profiling of error patterns across various problem-response pairs drawn from post-training, we identify two fundamental limitations of existing automated optimization modeling approaches: (L1) the sparsity of error-specific problems and (L2) the sparse rewards associated with difficult problems. We demonstrate that these limitations can result in suboptimal performance in domain-specific post-training for LLMs. To tackle these two limitations, we propose a novel error-driven learning framework -- namely,…
Peer Reviews
Decision · Submitted to ICLR 2026
- The "error locality" insight and its application to guide both data synthesis and training is an impressive contribution.
- The framework is comprehensive and well-engineered. Empirical results are good, showing improvements over strong baselines. This work advances the practicality of LLMs for optimization.
- The "error locality" observation, while compelling, is not sufficiently validated across diverse models and problem types, questioning its generality.
- DFPO relies on the critical assumption that teacher-corrected responses align with the base model's distribution, but provides no quantitative evidence to support this.
- Key design choices (e.g., reward function weighting) lack ablation studies. The OOD generalization claim, while supported by a new benchmark, is limited by its small scale.
The authors analyze the error patterns in automated optimization modeling and identify the localizable nature of error propagation. Building on this insight, they design an error-driven reverse data synthesis process that focuses on common error types to generate more challenging and informative training data. In addition, the proposed DFPO strategy effectively complements existing reinforcement learning methods by alleviating issues related to sparse rewards and distributional shift, using cont
Although the authors improve both stages of automated optimization modeling, data synthesis and post-training, the connection between the proposed error-driven data synthesis pipeline and the subsequent DFPO training strategy is not clearly articulated. The two components are insufficiently integrated, which weakens the overall methodological coherence. Moreover, several technical aspects lack sufficient detail. For instance, the paper only provides a single example of a common error type (data
1. The paper contributes MIND-Bench, a new, open-sourced benchmark for evaluating generalization. It is curated from textbooks and real-world industry scenarios, providing a valuable out-of-distribution test set for the community.
2. The proposed method achieves strong empirical results. On the same 7B model, MIND outperforms prior state-of-the-art training-based methods (like SIRL and OptMATH) in macro-average performance across six benchmarks.
3. The DFPO algorithm is an interesting and pragma
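The review above compares methods by macro-average performance across benchmarks. As a minimal sketch of what that metric means (not the paper's evaluation code), the snippet below contrasts a macro-average, where each benchmark counts equally, with a pooled micro-average. The benchmark names and counts are hypothetical placeholders, not results from the paper.

```python
from statistics import mean


def macro_average(results):
    """Mean of per-benchmark accuracies; every benchmark has equal weight."""
    return mean(correct / total for correct, total in results.values())


def micro_average(results):
    """Pooled accuracy over all problems; larger benchmarks weigh more."""
    total_correct = sum(correct for correct, _ in results.values())
    total_problems = sum(total for _, total in results.values())
    return total_correct / total_problems


# Hypothetical (correct, total) counts for illustration only.
toy_results = {
    "bench_a": (90, 100),
    "bench_b": (10, 20),
}

print(f"macro: {macro_average(toy_results):.3f}")
print(f"micro: {micro_average(toy_results):.3f}")
```

With benchmarks of unequal size the two numbers can differ noticeably, which is why it matters that a comparison states which one it reports.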
1. The paper's claim that errors are "localizable" is unsurprising—competent LLMs naturally make partial errors rather than completely wrong formulations. The sole evidence (Figure 1: error ratios from 100 samples) only computes the fraction of erroneous components, providing no analysis of error dependencies or propagation. The critical claim that errors "do not propagate" is never validated. Without showing that errors in one component (e.g., variables) don't cause errors in dependent componen
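The distinction this reviewer draws, between a per-response fraction of erroneous components and an analysis of whether errors in one component accompany errors in another, can be sketched as follows. This is an illustrative assumption about how such annotations might be represented, not the paper's method; the component names other than "variables" and all annotation values are hypothetical.

```python
def error_ratio(flags):
    """Fraction of components in one response that were judged erroneous."""
    return sum(flags.values()) / len(flags)


def conditional_error_rate(samples, given, target):
    """Share of responses with an erroneous `target` among those where
    `given` is erroneous. Returns None if `given` is never erroneous."""
    conditioned = [s for s in samples if s[given]]
    if not conditioned:
        return None
    return sum(s[target] for s in conditioned) / len(conditioned)


# Hypothetical annotations: True means the component was judged erroneous.
samples = [
    {"variables": True, "constraints": True, "objective": False},
    {"variables": False, "constraints": True, "objective": False},
    {"variables": True, "constraints": False, "objective": False},
    {"variables": False, "constraints": False, "objective": False},
]

ratios = [error_ratio(s) for s in samples]
base_rate = sum(s["constraints"] for s in samples) / len(samples)
dependent_rate = conditional_error_rate(samples, "variables", "constraints")
print(ratios, base_rate, dependent_rate)
```

Comparing the conditional rate against the unconditional base rate is one simple way to probe the dependency question; the per-response ratio alone cannot answer it.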
Taxonomy
Topics: Advanced Multi-Objective Optimization Algorithms · Machine Learning and Data Classification · Constraint Satisfaction and Optimization
