Pre-Generating Multi-Difficulty PDE Data for Few-Shot Neural PDE Solvers
Naman Choudhary, Vedant Singh, Ameet Talwalkar, Nicholas Matthew Boffi, Mikhail Khodak, Tanya Marwah

TL;DR
This paper demonstrates that pre-generating multi-difficulty PDE data and combining it effectively can significantly reduce the computational cost of training neural PDE solvers, especially for complex physics problems.
Contribution
It introduces a method for pre-generating PDE data at multiple difficulty levels to improve neural solver training efficiency and accuracy.
Findings
Pre-generating low and medium difficulty data aids learning high-difficulty physics.
Combining data from multiple difficulty levels reduces pre-generation compute by 8.9x.
Principled data curation across difficulty levels enhances neural PDE solver performance.
Abstract
A key aspect of learned partial differential equation (PDE) solvers is that the main cost often comes from generating training data with classical solvers rather than learning the model itself. Another is that there are clear axes of difficulty--e.g., more complex geometries and higher Reynolds numbers--along which problems become (1) harder for classical solvers and thus (2) more likely to benefit from neural speedups. Towards addressing this chicken-and-egg challenge, we study difficulty transfer on 2D incompressible Navier-Stokes, systematically varying task complexity along geometry (number and placement of obstacles), physics (Reynolds number), and their combination. Similar to how it is possible to spend compute to pre-train foundation models and improve their performance on downstream tasks, we find that by classically solving (analogously pre-generating) many low and medium…
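The budget trade-off described in the abstract can be illustrated with a small sketch. It only counts how many training examples a fixed classical-solver budget buys when it is spent on hard examples alone versus split across difficulty levels. The per-sample costs, the budget, the split, and the function names are all hypothetical values chosen for illustration; they are not taken from the paper.

```python
# Hypothetical per-sample classical-solver costs in arbitrary integer units.
# These numbers are illustrative assumptions, not measurements from the paper.
COST_PER_SAMPLE = {"easy": 1, "medium": 4, "hard": 20}


def samples_within_budget(budget_split, cost_per_sample=COST_PER_SAMPLE):
    """Return how many samples each difficulty level yields.

    budget_split maps a difficulty name to the compute units spent on it.
    """
    return {
        name: units // cost_per_sample[name]
        for name, units in budget_split.items()
    }


def total_cost(sample_counts, cost_per_sample=COST_PER_SAMPLE):
    """Return the compute units needed to generate the given sample counts."""
    return sum(n * cost_per_sample[name] for name, n in sample_counts.items())


# Same 1000-unit budget, spent two ways (both splits are assumptions).
hard_only = samples_within_budget({"hard": 1000})
mixed = samples_within_budget({"easy": 200, "medium": 400, "hard": 400})

print(hard_only, sum(hard_only.values()))
print(mixed, sum(mixed.values()))
```

Under these assumed costs the mixed split yields far more training examples for the same budget. Whether those extra easy and medium examples actually help on the hard regime is the empirical question the paper studies; the sketch does not address it.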
Peer Reviews
Decision: Submitted to ICLR 2026
- This paper uses multiple SOTA neural operators (CNO, FFNO, and Poseidon variants) to verify the training effectiveness of different data difficulty distributions within the dataset.
- The paper conducts extensive experiments across various scenarios, including fixed total sample size, fixed hard sample size, and few-shot downstream tasks, to validate its conclusions.
See Questions
1. The paper tackles an important and underexplored problem: how to allocate data-generation effort across different difficulty levels in neural PDE solver training.
2. The study is thorough and methodologically sound, with carefully controlled experiments along geometry, physics, and combined difficulty axes.
3. Results are consistent across multiple model families (CNO, FFNO, Poseidon), reinforcing the robustness of the findings.
4. The proposed idea of difficulty transfer and the notion of
1. The work is entirely empirical and lacks a theoretical explanation or analytical framework for the observed difficulty transfer phenomenon. It does not explain why medium-difficulty data generalize better than easy data toward harder regimes.
2. The experimental scope is somewhat limited. All results are based on 2D incompressible Navier–Stokes simulations. It remains unclear whether the conclusions hold for other PDE families or multi-physics problems.
3. The evaluation focuses mainly on n
- The paper is overall well written, with clear motivation and reasonable structure.
- The problem studied is important for efficiently building practical neural PDE solvers.
- The experiments clearly show the increased computational cost with increased difficulty settings.
My major concerns about this paper are regarding its contributions.
- From my perspective, it is intuitively accurate and well known that mixing data with different levels of difficulty facilitates model training. Some existing papers have already analyzed the transfer learning behaviour of scientific machine learning foundation models, showing that models trained with a specific set of PDE coefficients can be more data-efficient than models trained from scratch. [1] I think this conclusio
Taxonomy
Topics: Model Reduction and Neural Networks · Advanced Graph Neural Networks · Machine Learning in Materials Science
