Diagnosing Failure Modes of Neural Operators Across Diverse PDE Families

Lennon Shikhman

arXiv:2601.11428 · cs.LG · April 28, 2026

TL;DR

This paper introduces a standardized stress-testing framework to evaluate the robustness of neural PDE solvers across diverse PDE families and architectures, highlighting the gap between in-distribution accuracy and robustness.

Contribution

The paper presents a comprehensive evaluation method for neural PDE solvers under deployment-relevant shifts, revealing failure patterns that depend on both the architecture and the PDE family.

Findings

01

Strong in-distribution accuracy does not predict robustness.

02

Failure patterns depend on architecture and PDE family.

03

Spectral and rollout diagnostics help identify robustness issues.
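
As a rough illustration of what spectral and rollout diagnostics can look like in practice, the sketch below computes a per-wavenumber relative error and a per-step rollout error. It is not the paper's implementation: the function names `spectral_error` and `rollout_error`, the 1D periodic grid, and the relative L2 norm are all assumptions made for this example.

```python
import numpy as np


def spectral_error(pred, true, eps=1e-12):
    """Relative error per Fourier wavenumber (hypothetical diagnostic).

    pred, true: arrays of shape (n_samples, n_grid) on a 1D periodic grid.
    Returns an array of length n_grid // 2 + 1, one value per wavenumber.
    """
    pred_hat = np.fft.rfft(pred, axis=-1)
    true_hat = np.fft.rfft(true, axis=-1)
    # Root-mean-square amplitude over samples, separately for each mode.
    err = np.sqrt(np.mean(np.abs(pred_hat - true_hat) ** 2, axis=0))
    ref = np.sqrt(np.mean(np.abs(true_hat) ** 2, axis=0))
    return err / (ref + eps)


def rollout_error(step_fn, u0, reference):
    """Relative L2 error at each step of an autoregressive rollout.

    step_fn: callable mapping the current state to the next state
             (stands in for a trained surrogate; assumed interface).
    u0: initial state, shape (n_grid,).
    reference: ground-truth states, shape (n_steps, n_grid).
    """
    u = u0
    errors = []
    for target in reference:
        u = step_fn(u)  # feed the model its own previous output
        errors.append(np.linalg.norm(u - target) / np.linalg.norm(target))
    return np.array(errors)
```

A spectral error that grows with wavenumber suggests the surrogate is missing fine-scale content, while a rollout error curve that rises with step count shows how quickly one-step errors compound over the horizon.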

Abstract

Neural PDE solvers are increasingly used as learned surrogates for families of partial differential equations, where the key machine learning challenge is not only interpolation on a fixed benchmark distribution but generalization under structured shifts in coefficients, boundary conditions, discretization, and rollout horizon. Yet evaluation is still often dominated by in-distribution test error, making robustness difficult to assess. We introduce a standardized stress-testing framework for neural PDE solvers under deployment-relevant shift. We instantiate it on three representative architectures -- Fourier Neural Operators (FNOs), a DeepONet-style model, and convolutional neural operators (CNOs) -- across five qualitatively different PDE families: dispersive, elliptic, multi-scale fluid, financial, and chaotic systems. Across 750 trained models, we measure robustness using…
