Can We Predict Before Executing Machine Learning Agents?

Jingsheng Zheng; Jintian Zhang; Yujie Luo; Yuren Mao; Yunjun Gao; Lun Du; Huajun Chen; Ningyu Zhang

arXiv:2601.05930 · cs.CL · April 8, 2026

TL;DR

This paper proposes predicting which of a machine learning agent's candidate solutions will perform better before any of them is run, replacing some costly executions with LLM-based predictive reasoning. It instantiates the idea in a new agent, FOREAGENT, and a corpus of 18,438 pairwise comparisons.

Contribution

The paper formalizes the Data-centric Solution Preference task, builds a corpus of 18,438 pairwise comparisons, and develops FOREAGENT, an agent built around a Predict-then-Verify loop that accelerates convergence.
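As a rough illustration of how a pairwise preference task can be scored, the sketch below assumes each comparison pairs two candidate solutions with a label recording which one did better when actually executed. The `Comparison` schema, its field names, and `pairwise_accuracy` are hypothetical and are not taken from the paper or its released dataset.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass(frozen=True)
class Comparison:
    """One pairwise record (hypothetical schema, not the released dataset's)."""
    data_report: str    # e.g. the text of a data analysis report
    solution_a: str     # first candidate solution (code or description)
    solution_b: str     # second candidate solution
    a_is_better: bool   # ground truth obtained by actually executing both


# A predictor returns True when it expects solution_a to outperform solution_b.
Predictor = Callable[[Comparison], bool]


def pairwise_accuracy(predictor: Predictor, comparisons: Sequence[Comparison]) -> float:
    """Fraction of comparisons where the predicted preference matches execution."""
    if not comparisons:
        raise ValueError("need at least one comparison")
    correct = sum(predictor(c) == c.a_is_better for c in comparisons)
    return correct / len(comparisons)
```

On a set where each solution wins about half the time, a predictor that guesses at random would score about 0.5, which is the natural reference point for reading a pairwise accuracy figure.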

Findings

01. When primed with a Verified Data Analysis Report, LLMs reach 61.5% accuracy in predicting which solution performs better, with robust confidence calibration.

02. FOREAGENT converges 6 times faster than traditional methods.

03. The code and dataset are publicly available in the zjunlp/predict-before-execute GitHub repository.

Abstract

Autonomous machine learning agents have revolutionized scientific discovery, yet they remain constrained by a Generate-Execute-Feedback paradigm. Previous approaches suffer from a severe Execution Bottleneck, as hypothesis evaluation relies strictly on expensive physical execution. To bypass these physical constraints, we internalize execution priors to substitute costly runtime checks with instantaneous predictive reasoning, drawing inspiration from World Models. In this work, we formalize the task of Data-centric Solution Preference and construct a comprehensive corpus of 18,438 pairwise comparisons. We demonstrate that LLMs exhibit significant predictive capabilities when primed with a Verified Data Analysis Report, achieving 61.5% accuracy and robust confidence calibration. Finally, we instantiate this framework in FOREAGENT, an agent that employs a Predict-then-Verify loop,…
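The abstract is truncated before it describes the Predict-then-Verify loop in detail, so the following is only a minimal sketch of the general idea: rank candidates with a cheap preference predictor, then spend real execution on the top few. The function `predict_then_verify`, its parameters, and the round-robin win counting are assumptions made for illustration and should not be read as FOREAGENT's actual algorithm.

```python
from typing import Callable, List, Sequence, Tuple

# Hypothetical interfaces: prefer(a, b) is True when the predictor expects
# candidate a to beat candidate b; execute(c) runs c for real and returns a
# score where higher is better.
Prefer = Callable[[str, str], bool]
Execute = Callable[[str], float]


def predict_then_verify(
    candidates: Sequence[str],
    prefer: Prefer,
    execute: Execute,
    verify_top_k: int = 1,
) -> Tuple[str, float]:
    """Rank candidates by predicted pairwise wins, then execute only the top k."""
    if not candidates:
        raise ValueError("need at least one candidate")
    if verify_top_k < 1:
        raise ValueError("verify_top_k must be at least 1")

    # Predict: round-robin comparison, one cheap predictor call per pair.
    n = len(candidates)
    wins: List[int] = [0] * n
    for i in range(n):
        for j in range(i + 1, n):
            if prefer(candidates[i], candidates[j]):
                wins[i] += 1
            else:
                wins[j] += 1

    # Verify: only the shortlisted candidates pay the cost of real execution.
    ranked = sorted(range(n), key=lambda idx: wins[idx], reverse=True)
    shortlist = ranked[:verify_top_k]
    scored = [(candidates[idx], execute(candidates[idx])) for idx in shortlist]
    return max(scored, key=lambda pair: pair[1])
```

With `n` candidates this makes `n * (n - 1) / 2` predictor calls but at most `verify_top_k` executions, which is where any saving would come from if predictions are much cheaper than runs. Raising `verify_top_k` trades more execution cost for less dependence on the predictor being right.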

Code & Models

Repositories

zjunlp/predict-before-execute (GitHub)
