Accelerating Data Generation for Nonlinear Temporal PDEs via Homologous Perturbation in Solution Space
Lei Liu, Zhenxin Huang, Hong Wang, Huanshuo Dong, Haiyang Xin, Hongwei Zhao, Bin Li

TL;DR
The paper introduces HOPSS, a novel algorithm that accelerates data generation for training neural operators on nonlinear PDEs by using homologous perturbations, reducing computational time while maintaining data quality.
Contribution
HOPSS is a new data generation method that efficiently produces training datasets for nonlinear PDEs with fewer time steps, lowering computational costs.
Findings
HOPSS reduces data generation time to about 10% of the time required by traditional methods.
It maintains comparable accuracy in model training.
It is effective on complex equations such as Navier-Stokes.
Abstract
Data-driven deep learning methods like neural operators have advanced in solving nonlinear temporal partial differential equations (PDEs). However, these methods require large quantities of solution pairs: the solution functions and right-hand sides (RHS) of the equations. These pairs are typically generated via traditional numerical methods, which need thousands of time-step iterations, far more than the dozens required for training, creating heavy computational and temporal overheads. To address these challenges, we propose a novel data generation algorithm, called HOmologous Perturbation in Solution Space (HOPSS), which directly generates training datasets with fewer time steps rather than following the traditional approach of generating datasets with many time steps. This algorithm simultaneously accelerates dataset generation and preserves the approximate precision required for…
Peer Reviews
Decision: ICLR 2026 Conference Desk Rejected Submission
The paper’s primary strength lies in addressing a critical bottleneck in scientific machine learning: the cost of generating training data for neural PDE solvers. HOPSS offers a conceptually simple yet powerful approach to drastically accelerate data creation without degrading data quality. Empirical evaluations, especially on the Navier–Stokes equation, demonstrate 10× faster dataset generation while preserving comparable model accuracy. In some configurations, models trained with HOPSS-generat
Despite its promise, several aspects of the method remain underdeveloped. First, HOPSS heavily depends on the quality and diversity of the base solutions used to seed the perturbation process. If the initial base set poorly represents the global solution manifold, the synthesized samples may propagate its bias, reducing generalization to unseen PDE dynamics. The paper would benefit from a clearer strategy or criterion for selecting representative base solutions. Second, the algorithm introduces
1. The authors consider the most significant bottleneck to scaling deep learning-based PDE solvers (NOs): the enormous computational cost and time required to generate large, high-fidelity solution-RHS datasets using traditional numerical solvers. 2. The proposed results show that the HOPSS method achieves a substantial acceleration (e.g., $10 \times$) while maintaining competitive accuracy for nonlinear equations (Burgers, KdV, Navier-Stokes).
1. The mathematical expression of the proposed process is: $$u_{\text{new}} = u_i + \mu \cdot u_j + \xi$$ This methodology assumes that perturbing a solution $u_i$ with another solution $u_j$ yields a physically related state. However, for nonlinear PDEs, where the operator is $\mathcal{A}(u) = f$, the principle of superposition is violated, making it clear that $\mathcal{A}(u_i + u_j) \not\equiv \mathcal{A}(u_i) + \mathcal{A}(u_j)$. Despite this, the entire method hinges on calculating the new
1. Data generation presents a main bottleneck for learning neural PDE solvers. The aim of this research is to alleviate this problem, so it is well motivated. 2. The approach the authors propose is lightweight and can be used to generate a large number of additional solutions with minimal overhead.
1. The method the authors suggest is not completely specified. 2. The numerical experiments are not convincing. I provide more details in the section below.
- The paper studies an important but often overlooked problem of efficient data generation for neural operators. - The method is simple but seemingly effective.
- It is not very clear how the new RHS is computed, especially the new nonlinear term $u u_x$. Is it just approximated by the finite difference method? - Please provide some error analysis of the nonlinear term in Sec 4 and how it relates to the perturbation parameter. - The method is not applicable to the many problems where the RHS is not part of the input. Minor: please avoid abusing notation, such as - - line 166: $O(Nn)$, $N$ was the nonlinear term, $n$ was the time index. - - line 271: $\Delt
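
The superposition concern raised in the reviews above can be illustrated numerically. The sketch below is not the paper's HOPSS implementation. It assumes a Burgers-type operator `A(u) = u u_x - nu u_xx` on a periodic domain, uses two arbitrary smooth functions as stand-ins for base solutions, and omits the noise term `xi`. The helper names `ddx` and `burgers_operator`, the grid size, and the values of `mu` and `nu` are all hypothetical choices made for illustration.

```python
import numpy as np


def ddx(u, length=2 * np.pi):
    """Spectral first derivative of a periodic sample vector u."""
    n = u.size
    # i * angular wavenumber for each Fourier mode
    ik = 2j * np.pi * np.fft.fftfreq(n, d=length / n)
    return np.real(np.fft.ifft(ik * np.fft.fft(u)))


def burgers_operator(u, nu=0.1):
    """Assumed nonlinear operator A(u) = u u_x - nu u_xx (illustrative only)."""
    ux = ddx(u)
    return u * ux - nu * ddx(ux)


n = 128
x = np.linspace(0.0, 2 * np.pi, n, endpoint=False)

# Stand-ins for two base solutions; not actual PDE solutions.
u_i = np.sin(x)
u_j = np.cos(2 * x)
mu = 0.05  # hypothetical perturbation weight

# Perturbed state u_new = u_i + mu * u_j (noise term xi omitted here).
u_new = u_i + mu * u_j

# RHS obtained by applying the operator to the perturbed state directly.
f_direct = burgers_operator(u_new)
# RHS that superposition would give if the operator were linear.
f_superposed = burgers_operator(u_i) + mu * burgers_operator(u_j)

# The gap comes entirely from the nonlinear term u u_x:
# mu * (u_i u_j,x + u_j u_i,x) + (mu^2 - mu) * u_j u_j,x
gap = f_direct - f_superposed
predicted_gap = (
    mu * (u_i * ddx(u_j) + u_j * ddx(u_i))
    + (mu**2 - mu) * u_j * ddx(u_j)
)

gap_norm = np.max(np.abs(gap))
print("max |A(u_new) - (A(u_i) + mu A(u_j))| =", gap_norm)
print("gap matches cross terms:", np.allclose(gap, predicted_gap, atol=1e-10))
```

Under these assumptions the gap is nonzero and consists only of cross terms from the nonlinear part, so a consistent right-hand side for the perturbed state has to come from evaluating the operator on `u_new` itself rather than from combining the right-hand sides of the base solutions.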
Taxonomy
Topics: Model Reduction and Neural Networks · Numerical methods for differential equations · Numerical Methods and Algorithms
