SF-DQN: Provable Knowledge Transfer using Successor Feature for Deep Reinforcement Learning

Shuai Zhang; Heshan Devaka Fernando; Miao Liu; Keerthiram Murugesan; Songtao Lu; Pin-Yu Chen; Tianyi Chen; Meng Wang

arXiv:2405.15920·cs.LG·September 24, 2024

TL;DR

This paper introduces SF-DQN, a deep reinforcement learning method leveraging successor features and generalized policy improvement, with proven convergence and superior transfer learning performance over traditional methods.

Contribution

It provides the first theoretical convergence guarantees for SF-DQN with GPI in transfer RL, demonstrating improved efficiency and generalization.

Findings

01. SF-DQN with GPI converges faster than deep Q-networks.

02. The method achieves better generalization in transfer RL tasks.

03. Numerical experiments confirm theoretical advantages.

Abstract

This paper studies the transfer reinforcement learning (RL) problem where multiple RL problems have different reward functions but share the same underlying transition dynamics. In this setting, the Q-function of each RL problem (task) can be decomposed into a successor feature (SF) and a reward mapping: the former characterizes the transition dynamics, and the latter characterizes the task-specific reward function. This Q-function decomposition, coupled with a policy improvement operator known as generalized policy improvement (GPI), reduces the sample complexity of finding the optimal Q-function, and thus the SF & GPI framework exhibits promising empirical performance compared to traditional RL methods like Q-learning. However, its theoretical foundations remain largely unestablished, especially when learning the successor features using deep neural networks (SF-DQN). This paper…
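
The decomposition and the GPI step described in the abstract can be illustrated with a small sketch. It assumes the standard linear form from the successor-feature literature, where the Q-function is the inner product of a successor feature vector and a task-specific reward weight vector; the abstract itself does not state this form. The function names, array shapes, and toy numbers below are hypothetical and are not taken from the paper, and the successor features are hard-coded rather than learned by a deep network as in SF-DQN.

```python
import numpy as np

def q_from_sf(psi, w):
    """Q-values at one state under the assumed linear decomposition Q = psi . w.

    psi: (num_actions, d) successor features of one policy at the state.
    w:   (d,) reward weights of the task (the "reward mapping").
    """
    return psi @ w

def gpi_action(psis, w):
    """Generalized policy improvement at one state.

    psis: (num_policies, num_actions, d) successor features of previously
          learned policies, all evaluated at the same state.
    Returns the action maximizing max_i Q_i(s, a) on the task with weights w.
    """
    q = psis @ w                          # (num_policies, num_actions)
    return int(np.argmax(q.max(axis=0)))  # best action over the best policy per action

# Toy (made-up) successor features: 2 source policies, 2 actions, d = 2.
psis = np.array([
    [[1.0, 0.0], [0.0, 1.0]],   # policy 1, actions 0 and 1
    [[0.5, 0.5], [0.2, 2.0]],   # policy 2, actions 0 and 1
])
w_new = np.array([0.0, 1.0])    # reward weights of a hypothetical new task
action = gpi_action(psis, w_new)
```

Because the successor features depend only on the shared transition dynamics and the policy, evaluating the source policies on a new task in this sketch requires only the new weight vector, which is the intuition behind the reduced sample complexity the abstract mentions.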

Taxonomy

Topics: Reinforcement Learning in Robotics