Adaptive Partitioning and Learning for Stochastic Control of Diffusion Processes

Hanqing Jin; Renyuan Xu; Yanzhao Yang

arXiv:2512.14991 · cs.LG · December 18, 2025

Open Access

TL;DR

This paper introduces an adaptive, model-based reinforcement learning algorithm for controlled diffusion processes with unbounded state spaces, providing theoretical regret bounds and demonstrating effectiveness in high-dimensional financial applications.

Contribution

The paper proposes an adaptive partitioning algorithm for reinforcement learning in controlled diffusion processes with unbounded state spaces, extending theoretical guarantees and practical performance to high-dimensional, continuous domains.

Findings

01 Regret bounds depend on the problem horizon, state dimension, reward growth order, and zooming dimension.

02 The adaptive scheme balances exploration and approximation.

03 Validated on high-dimensional portfolio optimization tasks.

Abstract

We study reinforcement learning for controlled diffusion processes with unbounded continuous state spaces, bounded continuous actions, and polynomially growing rewards: settings that arise naturally in finance, economics, and operations research. To overcome the challenges of continuous and high-dimensional domains, we introduce a model-based algorithm that adaptively partitions the joint state-action space. The algorithm maintains estimators of drift, volatility, and rewards within each partition, refining the discretization whenever estimation bias exceeds statistical confidence. This adaptive scheme balances exploration and approximation, enabling efficient learning in unbounded domains. Our analysis establishes regret bounds that depend on the problem horizon, state dimension, reward growth order, and a newly defined notion of zooming dimension tailored to unbounded diffusion…
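
The refinement rule described in the abstract (split a partition once its estimation bias exceeds its statistical confidence) can be illustrated with a minimal sketch. This is not the authors' algorithm: the `Cell` class, the use of cell diameter as a bias proxy, the `1/sqrt(count)` confidence width, the scale parameters, and the bounded root box are all assumptions made for illustration. The state is taken to be scalar, so a point is a (state, action) pair.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Cell:
    """Axis-aligned box in the joint (state, action) space with running sums."""
    lower: tuple
    upper: tuple
    count: int = 0
    increment_sum: float = 0.0     # sum of state increments (drift estimate)
    sq_increment_sum: float = 0.0  # sum of squared increments (volatility estimate)
    reward_sum: float = 0.0
    children: list = field(default_factory=list)

    def diameter(self):
        # Longest side length, used here as a proxy for discretization bias.
        return max(u - l for l, u in zip(self.lower, self.upper))

    def contains(self, point):
        # Half-open box, so each point belongs to exactly one child after a split.
        return all(l <= p < u for l, p, u in zip(self.lower, point, self.upper))

    def confidence(self, scale=1.0):
        # Assumed confidence width that shrinks like 1/sqrt(count).
        return math.inf if self.count == 0 else scale / math.sqrt(self.count)

    def estimates(self, dt):
        # Empirical drift, squared volatility, and mean reward; needs count > 0.
        n = self.count
        return (self.increment_sum / (n * dt),
                self.sq_increment_sum / (n * dt),
                self.reward_sum / n)

    def split(self):
        # Halve every coordinate, producing 2**d children with fresh statistics.
        mids = [(l + u) / 2 for l, u in zip(self.lower, self.upper)]
        dims = len(mids)
        for mask in range(2 ** dims):
            lo = tuple(mids[i] if (mask >> i) & 1 else self.lower[i] for i in range(dims))
            hi = tuple(self.upper[i] if (mask >> i) & 1 else mids[i] for i in range(dims))
            self.children.append(Cell(lo, hi))


def find_leaf(cell, point):
    # Descend to the finest cell containing the point (point must lie in the root).
    while cell.children:
        cell = next(c for c in cell.children if c.contains(point))
    return cell


def update(root, point, increment, reward, bias_scale=1.0, conf_scale=1.0):
    # Record one observed transition, then refine if bias exceeds confidence.
    leaf = find_leaf(root, point)
    leaf.count += 1
    leaf.increment_sum += increment
    leaf.sq_increment_sum += increment * increment
    leaf.reward_sum += reward
    if bias_scale * leaf.diameter() > leaf.confidence(conf_scale):
        leaf.split()
    return leaf
```

Calling `update` repeatedly at nearby points refines the partition only where data accumulates, while rarely visited regions stay coarse. In this sketch, children start with empty statistics, a simplification that the paper's estimators need not share.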

Taxonomy

Topics: Advanced Bandit Algorithms Research · Stochastic processes and financial applications · Reinforcement Learning in Robotics