Continuous-time q-Learning for Jump-Diffusion Models under Tsallis Entropy

Lijun Bo; Yijie Huang; Xiang Yu; Tingting Zhang

arXiv:2407.03888 · math.OC · February 16, 2026

Open Access

TL;DR

This paper develops continuous-time q-learning algorithms for jump-diffusion models using Tsallis entropy, providing explicit policy characterizations and demonstrating their effectiveness in financial and control problems.

Contribution

It introduces novel q-learning algorithms under Tsallis entropy in continuous time, including explicit policy characterization and actor-critic methods for jump-diffusion models.

Findings

1. Optimal policies are explicitly characterized as distributions with compact support.

2. The proposed algorithms perform well in dark pool liquidation and control problems.

3. Under Tsallis entropy regularization, the optimal policy is not necessarily a Gibbs measure.
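
The compact support and the departure from the Gibbs form can be seen in a short derivation. This is a sketch under assumed notation that does not appear on this page: q(a) is the q-function at a fixed time and state, \gamma > 0 is a temperature parameter, p > 1 is the Tsallis index, and \psi is the normalizing multiplier. It uses the standard Tsallis entropy definition and is not necessarily the paper's exact statement.

```latex
% Assumed notation: q(a) = q-function at a fixed (t, x), \gamma > 0 temperature,
% p > 1 Tsallis index, \psi = normalizing (Lagrange) multiplier.
\begin{align*}
S_p(\pi) &= \frac{1}{p-1}\int_{\mathcal{A}} \bigl(\pi(a) - \pi(a)^p\bigr)\,da, \\
\pi^* &= \operatorname*{arg\,max}_{\pi \ge 0,\ \int_{\mathcal{A}} \pi\,da = 1}
         \int_{\mathcal{A}} q(a)\,\pi(a)\,da + \gamma\,S_p(\pi), \\
% Stationarity on \{\pi > 0\}: q(a) + \frac{\gamma}{p-1}\bigl(1 - p\,\pi(a)^{p-1}\bigr) = \lambda,
% with \psi := \frac{\gamma}{p-1} - \lambda.
\pi^*(a) &= \Bigl(\frac{p-1}{p\,\gamma}\Bigr)^{\frac{1}{p-1}}
            \bigl(q(a) + \psi\bigr)_+^{\frac{1}{p-1}},
\qquad \int_{\mathcal{A}} \pi^*(a)\,da = 1.
\end{align*}
```

The positive part (\cdot)_+ truncates the density wherever q(a) + \psi \le 0, which is where the compact support comes from. In the limit p \to 1 the Tsallis entropy reduces to the Shannon entropy and the usual exponential (Gibbs) form is recovered.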

Abstract

This paper studies continuous-time reinforcement learning in jump-diffusion models, featuring q-learning (the continuous-time counterpart of Q-learning) under Tsallis entropy regularization. In contrast to the Shannon entropy case, the general form of Tsallis entropy means that the optimal policy is not necessarily a Gibbs measure. A Lagrange multiplier and the KKT conditions are therefore needed to ensure that the learned policy is a probability density function. As a consequence, the characterization of the optimal policy using the q-function also involves a Lagrange multiplier. In response, we establish the martingale characterization of the q-function and devise two q-learning algorithms depending on whether the Lagrange multiplier can be derived explicitly or not. In the latter case, we consider different parameterizations of the optimal q-function and the optimal policy, and update them…
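
To illustrate the role of the Lagrange multiplier when it has no closed form, here is a minimal numerical sketch. It assumes a one-dimensional action grid and the policy form pi(a) = ((p-1)/(p*gamma) * (q(a) + psi))_+^(1/(p-1)), and finds psi by bisection so that the density integrates to one. The function name `tsallis_policy`, the quadratic toy q-function, and all parameter values are hypothetical choices for illustration, not the paper's algorithm.

```python
import numpy as np

def tsallis_policy(q_values, actions, gamma=1.0, p=2.0, tol=1e-10, max_iter=200):
    """Return (density, psi) on a uniform action grid (illustrative sketch).

    Assumed form: pi(a) = ((p-1)/(p*gamma) * (q(a) + psi))_+^(1/(p-1)), p > 1,
    with psi chosen so that the Riemann sum of pi over the grid equals 1.
    """
    da = actions[1] - actions[0]          # uniform grid spacing
    scale = (p - 1.0) / (p * gamma)
    expo = 1.0 / (p - 1.0)

    def density(psi):
        # Positive part truncates the density where q(a) + psi <= 0.
        return np.maximum(scale * (q_values + psi), 0.0) ** expo

    def mass(psi):
        return density(psi).sum() * da    # total mass, increasing in psi

    lo = -q_values.max()                  # mass(lo) == 0
    step = 1.0
    hi = lo + step
    while mass(hi) < 1.0:                 # grow the bracket until mass >= 1
        step *= 2.0
        hi = lo + step

    for _ in range(max_iter):             # bisection on the multiplier psi
        mid = 0.5 * (lo + hi)
        if mass(mid) < 1.0:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol:
            break

    psi = 0.5 * (lo + hi)
    return density(psi), psi

# Toy example (hypothetical): concave quadratic q-function on [-3, 3].
actions = np.linspace(-3.0, 3.0, 2001)
pi, psi = tsallis_policy(-actions**2, actions, gamma=1.0, p=2.0)
```

In this toy case the resulting density is exactly zero away from the maximizer of q (for these parameters, roughly outside |a| < 1.14), in contrast to a Gibbs density, which is positive everywhere.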

Taxonomy

Topics: Statistical Mechanics and Entropy · Fractional Differential Equations Solutions · Model Reduction and Neural Networks

Methods: Q-Learning · Entropy Regularization