Deep Transfer $Q$-Learning for Offline Non-Stationary Reinforcement Learning

Jinhang Chai; Elynn Chen; Jianqing Fan

arXiv:2501.04870·stat.ML·April 15, 2025·2 cites

Open Access

TL;DR

This paper introduces a novel transfer deep Q-learning method for non-stationary reinforcement learning, leveraging neural networks and a re-weighted sampling strategy to improve decision-making across dynamic, diverse populations.

Contribution

It develops a new transfer learning framework for non-stationary RL with neural networks, including a re-weighted sampling procedure and theoretical guarantees, addressing limitations of naive sample pooling.

Findings

01. Outperforms naive pooling in non-stationary RL scenarios

02. Theoretically guarantees transferability with neural networks

03. Validated on synthetic and real datasets

Abstract

In dynamic decision-making scenarios across business and healthcare, leveraging sample trajectories from diverse populations can significantly enhance reinforcement learning (RL) performance for specific target populations, especially when sample sizes are limited. While existing transfer learning methods primarily focus on linear regression settings, they lack direct applicability to reinforcement learning algorithms. This paper pioneers the study of transfer learning for dynamic decision scenarios modeled by non-stationary finite-horizon Markov decision processes, utilizing neural networks as powerful function approximators and backward inductive learning. We demonstrate that naive sample pooling strategies, effective in regression settings, fail in Markov decision processes. To address this challenge, we introduce a novel "re-weighted targeting procedure" to construct "transferable…
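The "backward inductive learning" mentioned in the abstract can be illustrated with a minimal sketch of offline $Q$-learning on a non-stationary finite-horizon MDP. This is not the paper's method: it assumes a single (target) dataset, uses a tabular sample average in place of the neural-network regression, and omits the transfer and re-weighting steps entirely. The names `backward_q_learning`, `greedy_action`, and the toy data are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def backward_q_learning(transitions, horizon, actions):
    """Estimate stage-wise Q-functions by backward induction.

    transitions[h] is a list of (state, action, reward, next_state) tuples
    observed at stage h. A tabular average stands in for the regression step
    (an assumption; the paper uses neural networks).
    """
    # q[horizon] is never filled in, so it acts as the all-zero terminal value.
    q = [defaultdict(float) for _ in range(horizon + 1)]
    for h in reversed(range(horizon)):
        sums = defaultdict(float)
        counts = defaultdict(int)
        for s, a, r, s_next in transitions[h]:
            # Regression target: reward plus the best estimated next-stage value.
            target = r + max(q[h + 1][(s_next, b)] for b in actions)
            sums[(s, a)] += target
            counts[(s, a)] += 1
        for key in sums:
            q[h][key] = sums[key] / counts[key]
    return q


def greedy_action(q, h, state, actions):
    """Return the action with the largest estimated Q-value at stage h."""
    return max(actions, key=lambda a: q[h][(state, a)])


# Toy offline data with stage-dependent dynamics and rewards (horizon 2).
actions = [0, 1]
transitions = [
    [("s", 0, 0.5, "x"), ("s", 1, 0.0, "y")],            # stage 0
    [("x", 0, 1.0, "end"), ("x", 1, 0.0, "end"),
     ("y", 0, 0.0, "end"), ("y", 1, 3.0, "end")],        # stage 1
]
q = backward_q_learning(transitions, horizon=2, actions=actions)
best_first_action = greedy_action(q, 0, "s", actions)
```

Each stage keeps its own $Q$-function because the MDP is non-stationary, and the stage-$h$ targets depend on the already-fitted stage-$(h+1)$ estimate. In this toy data the greedy first action gives up the immediate reward of 0.5 in order to reach the state with the larger stage-1 payoff.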

Taxonomy

Topics: Machine Learning and ELM

Methods: Linear Regression · Focus