Neural Predictor-Corrector: Solving Homotopy Problems with Reinforcement Learning

Jiayao Mai; Bangyan Liao; Zhenjun Zhao; Yingping Zeng; Haoang Li; Javier Civera; Tailin Wu; Yi Zhou; Peidong Liu

arXiv:2602.03086 · cs.LG · February 4, 2026

TL;DR

This paper introduces Neural Predictor-Corrector (NPC), a reinforcement-learning framework that replaces the hand-crafted step-size and termination heuristics of homotopy solvers with learned policies, improving efficiency and stability across domains.

Contribution

The paper unifies homotopy problems under a single predictor-corrector framework and replaces hand-crafted heuristics with policies learned through reinforcement learning, aiming for better generalization and performance.

Findings

1. NPC outperforms classical solvers in efficiency.
2. NPC demonstrates superior stability across tasks.
3. NPC generalizes effectively to unseen problem instances.

Abstract

The Homotopy paradigm, a general principle for solving challenging problems, appears across diverse domains such as robust optimization, global optimization, polynomial root-finding, and sampling. Practical solvers for these problems typically follow a predictor-corrector (PC) structure, but rely on hand-crafted heuristics for step sizes and iteration termination, which are often suboptimal and task-specific. To address this, we unify these problems under a single framework, which enables the design of a general neural solver. Building on this unified view, we propose Neural Predictor-Corrector (NPC), which replaces hand-crafted heuristics with automatically learned policies. NPC formulates policy selection as a sequential decision-making problem and leverages reinforcement learning to automatically discover efficient strategies. To further enhance generalization, we introduce an…
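The predictor-corrector structure mentioned in the abstract can be illustrated with a toy path tracker. The following is a minimal sketch and not the authors' NPC method or code: it deforms an easy start system g(x) = x - 1 into a target polynomial f(x) = x^3 - 2x - 5 through H(x, t) = (1 - t) g(x) + t f(x), using an Euler predictor and a Newton corrector. The polynomial, the start system, the function names, and the "double the step on fast convergence, halve it on failure" rule are all assumptions chosen for illustration. That rule and the fixed tolerance are examples of the hand-crafted step-size and termination heuristics that the paper proposes to replace with learned policies.

```python
# Toy predictor-corrector homotopy tracker (illustrative assumptions only;
# this is NOT the NPC method from the paper).

def f(x):   # target system: we want a root of f
    return x**3 - 2.0 * x - 5.0

def df(x):
    return 3.0 * x**2 - 2.0

def g(x):   # start system with the known root x = 1
    return x - 1.0

def H(x, t):    # homotopy: H(x, 0) = g(x), H(x, 1) = f(x)
    return (1.0 - t) * g(x) + t * f(x)

def H_x(x, t):  # partial derivative of H with respect to x
    return (1.0 - t) * 1.0 + t * df(x)

def H_t(x, t):  # partial derivative of H with respect to t
    return f(x) - g(x)

def newton_correct(x, t, tol, max_iters):
    """Newton corrector at fixed t; returns (x, iterations used, converged)."""
    for iters in range(max_iters + 1):
        residual = H(x, t)
        if abs(residual) < tol:        # hand-crafted termination rule
            return x, iters, True
        if iters < max_iters:
            x -= residual / H_x(x, t)
    return x, max_iters, False

def track_path(x=1.0, step=0.1, tol=1e-10, max_iters=5, min_step=1e-8):
    """Track the solution path from t = 0 to t = 1; returns (root, accepted steps)."""
    t, accepted = 0.0, 0
    while t < 1.0:
        h = min(step, 1.0 - t)
        t_new = 1.0 if step >= 1.0 - t else t + h
        # Predictor: one Euler step along dx/dt = -H_t / H_x.
        x_pred = x - h * H_t(x, t) / H_x(x, t)
        # Corrector: pull the prediction back onto the path at t_new.
        x_corr, iters, ok = newton_correct(x_pred, t_new, tol, max_iters)
        if ok:
            x, t = x_corr, t_new
            accepted += 1
            if iters <= 2:             # heuristic: easy step, so be bolder
                step *= 2.0
        else:
            step = h / 2.0             # heuristic: failed step, so retreat
            if step < min_step:
                raise RuntimeError("step size underflow while tracking path")
    return x, accepted

if __name__ == "__main__":
    root, n_steps = track_path()
    print(f"root = {root:.12f}, accepted steps = {n_steps}")
```

In this sketch, the quantities a learned policy would control are `step`, `tol`, and `max_iters`; here they are fixed constants and simple rules.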

Peer Reviews

Decision: ICLR 2026 Poster

Reviewer 01 · Rating 4 · Confidence 4

Strengths

1. Unified Framework for Diverse Homotopy Tasks: The paper identifies and formalizes the common PC structure underlying homotopy problems in optimization, root-finding, and sampling—an insight that helps consolidate fragmented research in these domains and highlights potential generalizability across tasks.
2. Empirical Validation Across Domains: The authors conduct comprehensive experiments on four representative homotopy tasks (Graduated Non-Convexity, Gaussian Homotopy, Homotopy Continuation,

Weaknesses

1. Limited Novelty in RL for Optimization/Sampling: The core premise of applying RL to improve optimization or sampling workflows is not new. As noted in the paper’s related work, prior studies (e.g., Li, 2019; Belder et al., 2023; Ye et al., 2025) have already explored RL for adaptive parameter tuning, optimizer design, and schedule prediction in similar problem spaces. The paper does not sufficiently distinguish NPC from these existing RL-driven optimization/sampling frameworks beyond its focu

Reviewer 02 · Rating 6 · Confidence 3

Strengths

- The paper provides a nice unifying perspective that was at least new to me.
- The method seems to improve over baselines on the reported problems. For some of the domains they evaluate on (GNC, root finding) I am not sure how well chosen the baselines are but that is rather on me.
- The method is pretty simple, and thus seems to be easy to reimplement. I am surprised that almost no hyperparameters must be changed from the stablebaselines defaults.
- Overall the paper is pretty clear and well w

Weaknesses

### Method

- The paper presents a very interesting and simple idea: use (reinforcement) learning to improve optimization and sampling methods. While I am not aware of any paper that discusses this under a single umbrella of homotopy problems, I have seen works that use learning to propose sampling steps, e.g., [1, 2, 3, 4, 5]. Some of these methods are mentioned in the related work section, yet I think it should become clearer that the objective of the paper is a unifying perspective.

### Wordi

Reviewer 03 · Rating 6 · Confidence 3

Strengths

1. This application of reinforcement learning seems quite novel to me. I appreciate that PPO is able to be used "out of the box" to solve the problem.
2. The problems chosen for the paper are quite diverse, which shows the applicability of the framework. For example, I could see the application to sampling / Langevin dynamics could eventually tie into generative modeling.
3. The computational requirements for this method are quite low (lines 314-315), making the research more accessi

Weaknesses

My main concern is that it seems the experiments were only conducted over one trial, as I could not find any mention of repetitions, and no error bars are reported in the tables. It is advisable to conduct multiple trials of each algorithm and use statistical testing to check that differences between algorithms are significant. I have also listed several questions in the section below; these are minor points that I think would be useful to address in the updated paper.

Taxonomy

Topics: Reinforcement Learning in Robotics · Machine Learning and Data Classification · Model Reduction and Neural Networks