Continuous-time q-learning for Markov regime switching system under Tsallis entropy
Minghui Zhang, Xun Li, Xin Zhang

TL;DR
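This paper introduces continuous-time q-learning algorithms for Markov regime-switching systems under Tsallis entropy regularization. It addresses two issues: Tsallis-regularized optimal policies need not be Gibbs measures, and existing continuous-time regime-switching RL methods are often restricted to the EMV framework. The algorithms are applied to portfolio optimization.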
Contribution
It develops novel continuous-time q-learning algorithms under Tsallis entropy and applies them to regime-switching portfolio optimization, expanding the scope of RL in complex systems.
Findings
Algorithms perform well in numerical experiments
Effective in regime-switching market portfolio optimization
Handles Tsallis entropy regularization, under which the optimal policy is not necessarily a Gibbs measure
Abstract
This paper studies continuous-time q-learning (the continuous-time counterpart of Q-learning) for Markov regime-switching systems under Tsallis entropy regularization. We address a difficulty in traditional RL algorithms: under Tsallis entropy regularization, the optimal policy distribution is not necessarily a Gibbs measure, which often complicates algorithm design. Furthermore, current continuous-time regime-switching RL algorithms have limited universality (they are often restricted to the EMV framework), so this study focuses on continuous-time q-learning for Markov regime-switching systems based on Tsallis entropy, aiming for a more universally applicable continuous-time RL method. We establish the martingale characterization of the q-function under Tsallis entropy for continuous-time Markov regime-switching systems. Based on this, we design two q-learning algorithms,…
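The remark that a Tsallis-regularized optimal policy need not be a Gibbs measure can be illustrated with a minimal discrete-action sketch. It is not the paper's continuous-time construction, and the paper's normalization of the Tsallis entropy may differ. The sketch assumes the common definition S_q(p) = (1 - sum_i p_i^q) / (q - 1) and entropic index q = 2, for which the maximizer of <p, z> + 0.5 * S_2(p) over the probability simplex is the Euclidean projection of z onto the simplex (often called sparsemax). The function names and the example scores `z` are hypothetical.

```python
import numpy as np

def tsallis_entropy(p, q):
    """Discrete Tsallis entropy (1 - sum_i p_i**q) / (q - 1); Shannon entropy at q = 1."""
    p = np.asarray(p, dtype=float)
    if np.isclose(q, 1.0):
        nz = p[p > 0]
        return float(-np.sum(nz * np.log(nz)))
    return float((1.0 - np.sum(p ** q)) / (q - 1.0))

def gibbs_policy(z):
    """Softmax: maximizer of <p, z> + Shannon entropy; every action gets positive mass."""
    z = np.asarray(z, dtype=float)
    e = np.exp(z - z.max())  # subtract the max for numerical stability
    return e / e.sum()

def tsallis2_policy(z):
    """Maximizer of <p, z> + 0.5 * S_2(p) on the simplex (projection of z onto the simplex)."""
    z = np.asarray(z, dtype=float)
    z_sorted = np.sort(z)[::-1]
    cumsum = np.cumsum(z_sorted)
    k = np.arange(1, z.size + 1)
    in_support = 1.0 + k * z_sorted > cumsum   # holds exactly for the supported prefix
    k_max = k[in_support][-1]
    tau = (cumsum[k_max - 1] - 1.0) / k_max    # threshold that makes the result sum to 1
    return np.maximum(z - tau, 0.0)

# Hypothetical action scores (e.g. values of a q-function at a fixed state).
z = np.array([1.0, 0.5, -1.0])
p_gibbs = gibbs_policy(z)
p_tsallis = tsallis2_policy(z)
```

With these scores the Tsallis policy assigns zero probability to the lowest-scoring action, while the Gibbs policy keeps every action strictly positive. This loss of full support is one reason Gibbs-based algorithm designs do not carry over directly.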
Taxonomy
Topics: Reinforcement Learning in Robotics · Age of Information Optimization · Advanced Bandit Algorithms Research
