Adaptive Learning via Off-Model Training and Importance Sampling for Fully Non-Markovian Optimal Stochastic Control. Complete version

Dorival Leão; Alberto Ohashi; Simone Scotti; Adolfo M.D da Silva

arXiv:2604.13147·stat.ML·April 16, 2026


TL;DR

This paper introduces a Monte Carlo learning framework for non-Markovian stochastic control problems, enabling off-model training and adaptive updates using importance sampling, with theoretical error bounds and numerical validation.

Contribution

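The paper constructs explicit dominating training laws and Radon-Nikodym importance weights for several classes of non-Markovian controlled systems. A fixed synthetic dataset generated under a reference law can then be reweighted to recover the dynamic programming operators of different target models, and recalibrated adaptively, without generating new trajectories.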
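The reweighting idea can be illustrated with a deliberately simple sketch. It is not the paper's construction: it assumes a toy Markovian setting in which the reference law has i.i.d. Gaussian increments with zero drift and each target model adds a constant drift `theta`, so the path-wise likelihood ratio has a closed form. All function names and parameter values here are hypothetical and chosen for illustration only.

```python
import numpy as np


def simulate_reference_paths(n_paths, n_steps, dt, seed=0):
    """Draw increments once under the reference law: i.i.d. N(0, dt)."""
    rng = np.random.default_rng(seed)
    return rng.normal(0.0, np.sqrt(dt), size=(n_paths, n_steps))


def log_weights(increments, theta, dt):
    """Log likelihood ratio of the target law (increments N(theta*dt, dt))
    with respect to the reference law (increments N(0, dt)), per path."""
    n_steps = increments.shape[1]
    return theta * increments.sum(axis=1) - 0.5 * theta**2 * dt * n_steps


def reweighted_mean(values, log_w):
    """Importance-sampling estimate of the target expectation of `values`."""
    return float(np.mean(np.exp(log_w) * values))


# Toy assumption: horizon T = n_steps * dt = 1, so E_target[terminal] = theta.
dt, n_steps = 0.01, 100
increments = simulate_reference_paths(50_000, n_steps, dt)
terminal = increments.sum(axis=1)

# The same fixed dataset is reused for every candidate model parameter.
for theta in (0.0, 0.3, 0.6):
    estimate = reweighted_mean(terminal, log_weights(increments, theta, dt))
    print(f"theta={theta:.1f}  reweighted estimate={estimate:.3f}")
```

In this toy setting the estimates should land close to `theta` for each model, even though no path was simulated under a nonzero drift. The variance of the weights grows as the target moves away from the reference law, which is one reason the choice of a dominating training law matters.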

Findings

1. Non-asymptotic error bounds for neural network approximation of dynamic programming.

2. Quantitative estimates separating Monte Carlo error from model risk.

3. Numerical experiments demonstrating off-model training and adaptive importance sampling.

Abstract

This paper studies continuous-time stochastic control problems whose controlled states are fully non-Markovian and depend on unknown model parameters. Such problems arise naturally in path-dependent stochastic differential equations, rough-volatility hedging, and systems driven by fractional Brownian motion. Building on the discrete skeleton approach developed in earlier work, we propose a Monte Carlo learning methodology for the associated embedded backward dynamic programming equation. Our main contribution is twofold. First, we construct explicit dominating training laws and Radon-Nikodym weights for several representative classes of non-Markovian controlled systems. This yields an off-model training architecture in which a fixed synthetic dataset is generated under a reference law, while the dynamic programming operators associated with a target model are recovered by importance…
