Learning of Population Dynamics: Inverse Optimization Meets JKO Scheme
Mikhail Persiianov, Jiawei Chen, Petr Mokrov, Alexander Tyurin, Evgeny Burnaev, Alexander Korotin

TL;DR
This paper introduces iJKOnet, a novel method combining inverse optimization with the JKO scheme to learn population dynamics from snapshots, offering theoretical guarantees and improved performance without restrictive neural network architectures.
Contribution
The paper presents iJKOnet, an end-to-end adversarial training approach that integrates inverse optimization with the JKO scheme, eliminating the need for input-convex neural networks and providing theoretical guarantees.
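For readers unfamiliar with the scheme, a JKO step is commonly written as below. This is the standard textbook form, not a transcription of the paper's equations. The symbols $\tau$ (step size), $\rho_k$ (population at step $k$), $\mathcal{F}$ (energy functional), and $W_2$ (2-Wasserstein distance) are notation assumed here.

```latex
\rho_{k+1} \;=\; \operatorname*{arg\,min}_{\rho \in \mathcal{P}_2(\mathbb{R}^d)}
  \; \mathcal{F}(\rho) \;+\; \frac{1}{2\tau}\, W_2^2(\rho, \rho_k)
```

In the forward problem $\mathcal{F}$ is known and the sequence $\rho_0, \rho_1, \dots$ is computed. In the inverse setting summarized here, the snapshots are given and $\mathcal{F}$ is the unknown to be learned.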
Findings
Demonstrates improved performance over prior JKO-based methods
Provides theoretical guarantees for the proposed approach
Does not require restrictive neural network architectures
Abstract
Learning population dynamics involves recovering the underlying process that governs particle evolution, given evolutionary snapshots of samples at discrete time points. Recent methods frame this as an energy minimization problem in probability space and leverage the celebrated JKO scheme for efficient time discretization. In this work, we introduce iJKOnet, an approach that combines the JKO framework with inverse optimization techniques to learn population dynamics. Our method relies on a conventional adversarial training procedure and does not require restrictive architectural choices, e.g., input-convex neural networks. We establish theoretical guarantees for our methodology and demonstrate improved performance over prior JKO-based methods. The code of iJKOnet is available at https://github.com/MuXauJl11110/iJKOnet.
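To make the JKO time discretization concrete, here is a minimal sketch of a single forward JKO step on particles. It is not the authors' method and involves no learning. It assumes a potential-only energy with a convex potential $V$, in which case the step decouples across particles into the proximal problem $\min_y V(y) + \|y - x\|^2 / (2\tau)$. The function name `jko_step_potential`, the quadratic potential, and all parameter values are hypothetical choices made for illustration.

```python
import numpy as np


def jko_step_potential(x, grad_v, tau, n_iters=200, lr=0.05):
    """One JKO step for a potential-only energy, applied particle-wise.

    Each particle y minimizes V(y) + ||y - x||^2 / (2 * tau), solved here
    by plain gradient descent started at the current position x.
    """
    y = x.copy()
    for _ in range(n_iters):
        # Gradient of the per-particle objective: potential + proximity term.
        grad = grad_v(y) + (y - x) / tau
        y = y - lr * grad
    return y


# Hypothetical quadratic potential V(y) = 0.5 * a * ||y - m||^2 (assumption).
a, m, tau = 2.0, np.array([1.0, -1.0]), 0.1
grad_v = lambda y: a * (y - m)

rng = np.random.default_rng(0)
x0 = rng.normal(size=(500, 2))           # snapshot at step k
x1 = jko_step_potential(x0, grad_v, tau)  # snapshot at step k + 1

# For this quadratic potential the proximal step has a closed form,
# which serves as a check on the iterative solution.
closed_form = (x0 + tau * a * m) / (1.0 + tau * a)
max_err = np.max(np.abs(x1 - closed_form))
```

Under these assumptions every particle moves a fraction of the way toward the minimizer `m` of the potential, and `max_err` measures how closely the gradient-descent solution matches the closed-form proximal map. Energies with interaction or internal terms do not decouple this way and need a different solver.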
Peer Reviews
Decision: ICLR 2026 Poster
- iJKOnet is different from previous methods that either require potential-only energies or precomputed OT couplings. The formulation of energy functional recovery using inverse optimization within the JKO scheme provides a conceptually clear route to modeling population-level dynamics from discrete snapshots, directly leveraging optimal transport geometry.
- Theorem 3.1 gives a non-trivial quality guarantee, explicitly bounding the distance between the gradients of learned and ground-truth pote
- The main theoretical guarantee (Theorem 3.1) exclusively addresses potential energy functionals with strongly convex smooth potentials, while the practical method is claimed to work for broader energy functionals (e.g., including interaction and internal terms). This leaves a theoretical gap between what the analysis covers and what is demonstrated empirically. Specifically, the inability to provide quality bounds for non-potential and more general functionals limits the rigor of the claims. (
1. This paper is well written.
2. The figures are well prepared and greatly facilitate understanding of the content.
3. Theoretical proofs are provided, which give solid theoretical guarantees for the proposed method.
4. The background section is clearly presented and offers helpful context.
1. The interchange of the min and Σ operators between Equations (10) and (11) lacks justification or analysis. It is unclear under what conditions this exchange is mathematically valid.
2. The method relies on an adversarial training procedure, which is known to introduce instability. It would be important to discuss whether any measures were taken to mitigate this issue.
3. The work builds heavily on existing studies (e.g., Terpin et al., 2024; JKOnet∗). The specific contributions and novelty
The JKO formulation of gradient flow as a proximal optimization problem is a fundamental tool in probability theory. It has been found difficult to use effectively as a generative tool, and the reformulation of the JKO functional in this paper is interesting and could be promising.
1) The main weakness is that the paper does not fully implement the proposed loss functional. Indeed, the interaction terms and the entropy terms are ignored and the same architecture as in JKOnet is used. This reduces the class of PDEs considered to a very small class of linear PDEs. The mismatch between the theory, which considers general PDEs, and the implementation is too great to make the paper convincing. Ignoring the entropy term, which I suspect is very difficult to estimate, is a s
Taxonomy
Topics: Innovation Diffusion and Forecasting · Opinion Dynamics and Social Influence
