Deep Q-learning: a robust control approach

Balazs Varga; Balazs Kulcsar; Morteza Haghir Chehreghani

arXiv:2201.08610 · cs.LG · November 8, 2022 · 1 citation

TL;DR

This paper reinterprets deep Q-learning through a control theory lens, using robust control techniques to analyze and improve its learning stability and convergence without relying on traditional RL heuristics.

Contribution

It introduces a control-oriented framework for deep Q-learning, employing robust controllers like H2 and Hinf to enhance stability and convergence, replacing target networks and replay buffers.

Findings

1. Hinf controllers slightly outperform Double deep Q-learning in simulations.

2. The control-based approach offers greater transparency and theoretical grounding.

3. Learning stability can be improved using robust control techniques.

Abstract

In this paper, we place deep Q-learning into a control-oriented perspective and study its learning dynamics with well-established techniques from robust control. We formulate an uncertain linear time-invariant model by means of the neural tangent kernel to describe learning. We show the instability of learning and analyze the agent's behavior in the frequency domain. Then, we ensure convergence via robust controllers acting as dynamical rewards in the loss function. We synthesize three controllers: a state-feedback gain-scheduling H2 controller, a dynamic Hinf controller, and a constant-gain Hinf controller. Setting up the learning agent with a control-oriented tuning methodology is more transparent and has well-established literature compared to the heuristics in reinforcement learning. In addition, our approach does not use a target network or randomized replay memory. The role of the target network is overtaken…
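The abstract refers to the neural tangent kernel (NTK) as the tool for building a linear model of learning. As a rough illustration of that general idea only, the sketch below computes the empirical NTK of a tiny tanh network in NumPy and checks that one small gradient step on a squared loss changes the outputs approximately as the linear model f + alpha * NTK * (y - f) predicts. This is a generic supervised-learning example, not the paper's uncertain LTI model or its controllers; the network, the data, and all names (init_params, jacobian, apply_step, alpha) are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
d, h, n = 2, 8, 5  # input dim, hidden width, number of samples (all hypothetical)

def init_params():
    # One-hidden-layer network: f(x) = w2 . tanh(W1 x + b1) + b2
    return {
        "W1": rng.normal(size=(h, d)) / np.sqrt(d),
        "b1": np.zeros(h),
        "w2": rng.normal(size=h) / np.sqrt(h),
        "b2": 0.0,
    }

def forward(p, X):
    A = np.tanh(X @ p["W1"].T + p["b1"])      # hidden activations, shape (n, h)
    return A @ p["w2"] + p["b2"]              # scalar output per sample, shape (n,)

def jacobian(p, X):
    # Gradient of each output w.r.t. all parameters, flattened as [W1, b1, w2, b2].
    A = np.tanh(X @ p["W1"].T + p["b1"])
    D = p["w2"] * (1.0 - A**2)                # df/dz for each hidden unit, (n, h)
    dW1 = (D[:, :, None] * X[:, None, :]).reshape(len(X), -1)
    return np.hstack([dW1, D, A, np.ones((len(X), 1))])

def apply_step(p, delta):
    # Add a flat parameter update, using the same ordering as jacobian().
    i = 0
    W1 = p["W1"] + delta[i:i + h * d].reshape(h, d); i += h * d
    b1 = p["b1"] + delta[i:i + h]; i += h
    w2 = p["w2"] + delta[i:i + h]; i += h
    return {"W1": W1, "b1": b1, "w2": w2, "b2": p["b2"] + delta[i]}

params = init_params()
X = rng.normal(size=(n, d))                   # toy inputs
y = rng.normal(size=n)                        # toy regression targets

f0 = forward(params, X)
J = jacobian(params, X)
ntk = J @ J.T                                 # empirical NTK on the sample set, (n, n)

alpha = 1e-3                                  # small step so the linearization holds
delta = -alpha * J.T @ (f0 - y)               # gradient step on 0.5 * ||f - y||^2
f1_actual = forward(apply_step(params, delta), X)
f1_linear = f0 + alpha * ntk @ (y - f0)       # NTK (first-order) prediction

print("actual change :", np.linalg.norm(f1_actual - f0))
print("linear mismatch:", np.linalg.norm(f1_actual - f1_linear))
```

The mismatch between the actual and the linearized update shrinks with the step size, which is what makes a linear description of the learning dynamics plausible for small steps.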

Code & Models

Repositories

bva-bme/Controlled_DQN (PyTorch, official)

Taxonomy

Topics: Neural Networks and Applications

Methods: Q-Learning