Adaptive Learning via Off-Model Training and Importance Sampling for Fully Non-Markovian Optimal Stochastic Control. Complete version
Dorival Leão, Alberto Ohashi, Simone Scotti, Adolfo M.D da Silva

TL;DR
This paper introduces a Monte Carlo learning framework for non-Markovian stochastic control problems, enabling off-model training and adaptive updates using importance sampling, with theoretical error bounds and numerical validation.
Contribution
It develops explicit training laws and importance weights for non-Markovian systems, allowing reweighting of a fixed dataset for different models and adaptive recalibration without new trajectory generation.
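As a rough illustration of the reweighting idea, the sketch below generates one fixed dataset under a reference law and reuses it for a different target model through likelihood-ratio weights. It is a toy example, not the paper's construction: the model (Gaussian increments with an unknown constant drift `mu`), the path functional (a running maximum), and every function name are assumptions introduced here for illustration only.

```python
import numpy as np


def simulate_reference_paths(n_paths, n_steps, dt, seed=0):
    """Draw increments under the reference law: i.i.d. N(0, dt), zero drift."""
    rng = np.random.default_rng(seed)
    return rng.normal(0.0, np.sqrt(dt), size=(n_paths, n_steps))


def likelihood_ratio(increments, mu, dt):
    """Density ratio of the target law (drift mu) to the reference law (drift 0).

    For Gaussian increments the per-step ratio is exp(mu * x - mu**2 * dt / 2),
    so the path weight is the product over steps.
    """
    n_steps = increments.shape[1]
    return np.exp(mu * increments.sum(axis=1) - 0.5 * mu**2 * n_steps * dt)


def reweighted_mean(values, weights):
    """Importance-sampling estimate of the target-law expectation of `values`."""
    return float(np.mean(weights * values))


# One fixed synthetic dataset, generated once under the reference law.
dt, n_steps = 0.02, 50
increments = simulate_reference_paths(50_000, n_steps, dt)
paths = np.cumsum(increments, axis=1)
terminal = paths[:, -1]            # terminal value of each path
running_max = paths.max(axis=1)    # a path-dependent functional

# Reuse the same dataset for several candidate drift values.
for mu in (0.0, 0.25, 0.5):
    w = likelihood_ratio(increments, mu, dt)
    print(
        f"mu={mu:.2f}  "
        f"E[X_T]~{reweighted_mean(terminal, w):.3f}  "
        f"E[max X]~{reweighted_mean(running_max, w):.3f}"
    )
```

In this toy model the reweighted estimate of the terminal mean should be close to `mu * T` with `T = n_steps * dt`, which gives a quick sanity check on the weights. The variance of the weights grows as the target drifts away from the reference, which is one reason the choice of a dominating training law matters.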
Findings
Non-asymptotic error bounds for neural network approximation of dynamic programming.
Quantitative estimates separating Monte Carlo error from model risk.
Numerical experiments demonstrating off-model training and adaptive importance sampling.
Abstract
This paper studies continuous-time stochastic control problems whose controlled states are fully non-Markovian and depend on unknown model parameters. Such problems arise naturally in path-dependent stochastic differential equations, rough-volatility hedging, and systems driven by fractional Brownian motion. Building on the discrete skeleton approach developed in earlier work, we propose a Monte Carlo learning methodology for the associated embedded backward dynamic programming equation. Our main contribution is twofold. First, we construct explicit dominating training laws and Radon--Nikodym weights for several representative classes of non-Markovian controlled systems. This yields an off-model training architecture in which a fixed synthetic dataset is generated under a reference law, while the dynamic programming operators associated with a target model are recovered by importance…
