Weakly Coupled Deep Q-Networks

Ibrahim El Shar; Daniel R. Jiang

arXiv:2310.18803·cs.LG·October 31, 2023·1 cites

TL;DR

This paper introduces WCDQN, a reinforcement learning algorithm for weakly coupled Markov decision processes that improves convergence speed by leveraging structural problem properties and subagent cooperation.

Contribution

The paper presents WCDQN, a novel deep reinforcement learning method that efficiently handles structured problems with multiple subproblems by using a single network and subagent cooperation.

Findings

1. Faster convergence than DQN in experiments with up to 10 subproblems.

2. Almost-sure convergence of the tabular version, WCQL, to the optimal action value.

3. Effective in high-action and continuous state space settings.

Abstract

We propose weakly coupled deep Q-networks (WCDQN), a novel deep reinforcement learning algorithm that enhances performance in a class of structured problems called weakly coupled Markov decision processes (WCMDP). WCMDPs consist of multiple independent subproblems connected by an action space constraint, which is a structural property that frequently emerges in practice. Despite this appealing structure, WCMDPs quickly become intractable as the number of subproblems grows. WCDQN employs a single network to train multiple DQN "subagents", one for each subproblem, and then combines their solutions to establish an upper bound on the optimal action value. This guides the main DQN agent towards optimality. We show that the tabular version, weakly coupled Q-learning (WCQL), converges almost surely to the optimal action value. Numerical experiments show faster convergence compared to DQN and…
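
The idea of guiding the main agent with an upper bound can be illustrated with a small tabular sketch. This is not the authors' implementation: the abstract only says that subagent solutions are combined into an upper bound, so the combination rule used here (a sum of subagent values plus a Lagrangian budget term), the clipping step, and all names (`upper_bound`, `guided_q_update`, `lam`, `budget`) are assumptions made for illustration.

```python
def upper_bound(sub_q, sub_states, sub_actions, lam, budget, gamma):
    """Combine per-subproblem Q-values into a bound on the joint Q-value.

    ASSUMPTION: the bound is the discounted Lagrangian budget term plus the
    sum of subagent values. The abstract does not specify the rule.
    """
    budget_term = lam * budget / (1.0 - gamma)
    subagent_sum = sum(
        q[(s, a)] for q, s, a in zip(sub_q, sub_states, sub_actions)
    )
    return budget_term + subagent_sum


def guided_q_update(q, state, action, reward, next_state,
                    feasible_next, bound, alpha, gamma):
    """One Q-learning step, then clip the estimate to the upper bound.

    ASSUMPTION: 'guiding' the main agent is modeled as taking the minimum of
    the standard Q-learning estimate and the bound.
    """
    # Standard Q-learning target over the feasible joint actions.
    best_next = max(q.get((next_state, a), 0.0) for a in feasible_next)
    target = reward + gamma * best_next
    estimate = (1.0 - alpha) * q.get((state, action), 0.0) + alpha * target
    # Never let the main agent's estimate exceed the subagent-derived bound.
    q[(state, action)] = min(estimate, bound)
    return q[(state, action)]


# Hypothetical two-subproblem example.
sub_q = [{("s0", "a1"): 1.0}, {("t0", "a0"): 2.0}]
bound = upper_bound(sub_q, ["s0", "t0"], ["a1", "a0"],
                    lam=0.5, budget=1.0, gamma=0.5)

main_q = {}
joint_state, joint_action = ("s0", "t0"), ("a1", "a0")
value = guided_q_update(main_q, joint_state, joint_action, reward=10.0,
                        next_state=("s1", "t1"), feasible_next=[joint_action],
                        bound=bound, alpha=1.0, gamma=0.5)
```

In this toy run the unclipped estimate would overshoot the bound, so the stored value is the bound itself. The design intuition is that an overestimating main agent is pulled back toward values the cheaper subagents can certify.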

Videos

Weakly Coupled Deep Q-Networks · SlidesLive

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Reinforcement Learning in Robotics · Explainable Artificial Intelligence (XAI)

Methods: Convolution · Dense Connections · Q-Learning · Deep Q-Network