Understanding and Diagnosing Deep Reinforcement Learning

Ezgi Korkmaz

arXiv:2406.16979·cs.LG·June 26, 2024

Open Access

TL;DR

This paper introduces a systematic method to analyze the stability and sensitivities of deep neural policies in reinforcement learning, revealing how training techniques influence decision boundary robustness and policy reliability.

Contribution

The paper provides a theoretically grounded approach for identifying unstable directions in the decision boundary of deep neural policies, and compares the effects of robust versus standard training in deep reinforcement learning.

Findings

01

Robust training leads to disjoint unstable directions with larger oscillations than standard training.

02

The method effectively identifies correlated instability directions.

03

Sample shifts significantly alter sensitive directions in policy landscapes.

Abstract

Deep neural policies have recently been deployed in a diverse range of settings, from biotechnology to automated financial systems. However, using deep neural networks to approximate the value function raises concerns about decision boundary stability, in particular the sensitivity of policy decision making to indiscernible, non-robust features arising from highly non-convex and complex deep neural manifolds. These concerns are an obstacle to understanding the reasoning of deep neural policies and their foundational limitations. Hence, it is crucial to develop techniques for understanding the sensitivities in the learnt representations of neural network policies. To achieve this, we introduce a theoretically founded method that provides a systematic analysis of the unstable directions in the deep neural policy decision boundary across both…

Taxonomy

Topics: Innovation Diffusion and Forecasting

Methods: Sparse Evolutionary Training