Enhancing Stability of Physics-Informed Neural Network Training Through Saddle-Point Reformulation
Dmitry Bylinkin, Mikhail Aleksandrov, Savelii Chezhegov, Aleksandr Beznosikov

TL;DR
This paper introduces a saddle-point reformulation of PINN training to improve stability, backed by theoretical analysis and extensive experiments showing superior performance over existing methods.
Contribution
It presents a novel saddle-point reformulation of PINNs that enhances training stability and effectiveness, supported by theoretical foundations and comprehensive experiments.
Findings
Outperforms state-of-the-art PINN training methods
Provides theoretical analysis of saddle-point reformulation
Demonstrates improved stability across various tasks
Abstract
Physics-informed neural networks (PINNs) have gained prominence in recent years and are now effectively used in a number of applications. However, their performance remains unstable due to the complex landscape of the loss function. To address this issue, we reformulate PINN training as a nonconvex-strongly concave saddle-point problem. After establishing the theoretical foundation for this approach, we conduct an extensive experimental study, evaluating its effectiveness across various tasks and architectures. Our results demonstrate that the proposed method outperforms the current state-of-the-art techniques.
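To make the idea of a nonconvex-strongly concave saddle-point problem concrete, here is a minimal sketch on a toy problem. It is not the paper's algorithm. It assumes an objective of the form min over theta, max over pi on the simplex, of sum_i pi_i L_i(theta) - lam * KL(pi || pi_hat), where the KL term is one possible Bregman-divergence regularizer that makes the inner problem strongly concave. The two quadratic losses, the names `losses`, `loss_grads`, and `gda_saddle`, and all step sizes are hypothetical stand-ins for PINN residual and boundary losses.

```python
import numpy as np

def losses(theta):
    # Hypothetical stand-ins for a PDE-residual loss and a boundary loss.
    return np.array([(theta - 1.0) ** 2, 4.0 * (theta + 1.0) ** 2])

def loss_grads(theta):
    # Derivatives of the two toy losses with respect to theta.
    return np.array([2.0 * (theta - 1.0), 8.0 * (theta + 1.0)])

def gda_saddle(theta0=0.0, lam=1.0, lr_theta=0.05, lr_pi=0.1, steps=500):
    """Alternating descent on theta and entropic mirror ascent on pi.

    Assumed objective (not taken from the paper):
        f(theta, pi) = sum_i pi_i * L_i(theta) - lam * KL(pi || pi_hat)
    """
    pi_hat = np.array([0.5, 0.5])   # reference weights (assumed uniform)
    pi = pi_hat.copy()
    theta = theta0
    for _ in range(steps):
        # Descent step on the model parameter using the weighted loss gradient.
        theta = theta - lr_theta * float(pi @ loss_grads(theta))
        # Gradient of f with respect to pi, evaluated at the updated theta.
        g = losses(theta) - lam * (np.log(pi / pi_hat) + 1.0)
        # Exponentiated-gradient (mirror ascent) step; keeps pi on the simplex.
        logits = np.log(pi) + lr_pi * g
        logits -= logits.max()      # numerical stabilization
        pi = np.exp(logits)
        pi /= pi.sum()
    return theta, pi

if __name__ == "__main__":
    theta, pi = gda_saddle()
    print("theta:", theta, "weights:", pi)
```

In this sketch the regularizer gives the inner maximization a unique solution, pi_i proportional to pi_hat_i * exp(L_i(theta) / lam), so the weights shift toward the larger loss term without collapsing onto a single one.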
Peer Reviews
Decision·ICLR 2026 Poster
1. High novelty. The paper reformulates PINN into a new optimization framework that mitigates gradient-conflict issues. 2. Clear and rigorous. The problem setup and method are explained clearly, and the paper provides rich theory, such as complexity bounds, that demonstrates the strength of the approach. 3. Strong experiments. The evaluation across 22 PDEs is comprehensive and shows the stability of the proposed method.
Although this paper already compares against many baseline optimizers, it would be better to include some SOTA methods such as [1] and [2]; in particular, [2] also experiments on the Burgers equation and achieves a much lower error. [1] Rathore, Pratik, et al. "Challenges in training PINNs: A loss landscape perspective." arXiv preprint arXiv:2402.01868 (2024). [2] Kiyani, Elham, et al. "Which Optimizer Works Best for Physics-Informed Neural Networks and Kolmogorov-Arnold Networks?." arXiv preprint arXiv:25
1. **Clear motivation**: The work addresses a well-known bottleneck in PINN training: gradient imbalance between boundary and residual terms. The motivation for stabilizing the training through adaptive loss weighting is well articulated and relevant. 2. **Novel regularization for strong concavity**: Introducing a divergence-based regularization (via Bregman divergence) ensures strong concavity in the inner maximization problem, which theoretically stabilizes the optimization dynamics. 3. **s
1. **Overall poor presentation, including unclear definitions**: for example, in Equation (1), quantities such as $M_r$, $M$, and the role of $S$ and $\hat\pi$ are not clearly defined. 2. **Potentially incorrect criticism of prior work**: I think the claim that Liu & Wang (2021) used $\pi \in \mathbb R^d$ and suffered instability due to the unconstrained weight space may be inaccurate. The dimension of $\pi$ in Liu & Wang (2021) is also the number of losses. 3. **Justification for reformulatio
1. Reformulating PINN training as a nonconvex–strongly concave saddle-point problem and introducing the AdaptiveBGDA optimizer based on Bregman divergence is innovative. The method offers a fresh perspective distinct from traditional gradient-based techniques. 2. Theoretical analyses are rigorous, with proven convergence guarantees. Experimental results across multiple PDE benchmarks show consistent improvements over state-of-the-art methods. 3. The paper is well-structured, with clear explan
1. The theoretical analysis (Theorem 1 and Lemma 2) establishes convergence under idealized assumptions, but these are not explicitly connected to observed empirical behaviors. The discussion lacks explanation of how theoretical stability guarantees translate to improved convergence in practice. → Improvement: Add a section explicitly mapping theoretical assumptions (e.g., strong concavity, bounded divergence) to empirical observations in experiments. 2. Although the paper includes a link to a
Taxonomy
Topics: Neural Networks and Applications
