No Representation, No Trust: Connecting Representation, Collapse, and Trust Issues in PPO

Skander Moalla; Andrea Miele; Daniil Pyatko; Razvan Pascanu; Caglar Gulcehre

arXiv:2405.00662 · cs.LG · November 21, 2024 · 1 citation

TL;DR

This paper studies how representation collapse affects Proximal Policy Optimization (PPO) and shows that regularizing representation dynamics with a new auxiliary loss can prevent performance collapse.

Contribution

It introduces Proximal Feature Optimization (PFO), a novel auxiliary loss that mitigates representation collapse and improves PPO stability in RL environments.
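This page describes PFO only as an auxiliary loss that regularizes representation dynamics. As a rough illustration of that idea, the sketch below assumes the penalty is the mean squared change between the features produced by the current network and those produced by the network that collected the data. The function names, the squared-error form, and the coefficient `coef` are hypothetical and are not taken from the authors' code.

```python
def feature_change_penalty(current_features, old_features):
    """Mean squared change between two batches of feature vectors.

    ASSUMPTION: a squared-error penalty on feature drift; the form used
    in the paper may differ.
    """
    if len(current_features) != len(old_features):
        raise ValueError("batches must have the same number of states")
    total, count = 0.0, 0
    for cur, old in zip(current_features, old_features):
        if len(cur) != len(old):
            raise ValueError("feature vectors must have the same size")
        for c, o in zip(cur, old):
            total += (c - o) ** 2
            count += 1
    return total / count if count else 0.0


def regularized_loss(ppo_loss, current_features, old_features, coef=1.0):
    """Add the feature-drift penalty to an already computed PPO loss.

    `coef` is a hypothetical weighting hyperparameter.
    """
    return ppo_loss + coef * feature_change_penalty(current_features, old_features)
```

In this sketch the penalty is zero when the features have not moved and grows as the representation drifts within a policy update, which is one way to read "regularizing representation dynamics".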

Findings

01 Representation rank deterioration correlates with performance collapse.

02 Stronger non-stationarity worsens feature collapse and agent performance.

03 PFO effectively mitigates representation collapse and enhances PPO stability.
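To make "representation rank" concrete, here is a minimal sketch that counts the linearly independent directions in a batch of feature vectors, using Gaussian elimination with a tolerance. The function name `feature_rank`, the tolerance value, and the toy feature matrices are assumptions for illustration. The paper may use a different rank measure, for example one based on singular values.

```python
def feature_rank(features, tol=1e-8):
    """Numerical rank of a feature matrix (rows = states, columns = features).

    ASSUMPTION: rank via Gaussian elimination with partial pivoting;
    pivots with absolute value <= tol are treated as zero.
    """
    m = [[float(x) for x in row] for row in features]
    n_rows = len(m)
    n_cols = len(m[0]) if m else 0
    rank = 0
    for col in range(n_cols):
        if rank == n_rows:
            break
        # Pick the remaining row with the largest entry in this column.
        pivot = max(range(rank, n_rows), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) <= tol:
            continue  # no independent direction in this column
        m[rank], m[pivot] = m[pivot], m[rank]
        for r in range(rank + 1, n_rows):
            factor = m[r][col] / m[rank][col]
            for c in range(col, n_cols):
                m[r][c] -= factor * m[rank][c]
        rank += 1
    return rank


# Toy data: four states with three features each.
healthy = [[1, 0, 0], [0, 1, 0], [0, 0, 1], [1, 1, 1]]
collapsed = [[1, 2, 3], [2, 4, 6], [3, 6, 9], [-1, -2, -3]]  # all rows parallel
```

In the toy data, `healthy` spans three directions, while every row of `collapsed` is a multiple of the same vector, so its rank is one. A drop of this kind over training is what the findings above call rank deterioration.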

Abstract

Reinforcement learning (RL) is inherently rife with non-stationarity since the states and rewards the agent observes during training depend on its changing policy. Therefore, networks in deep RL must be capable of adapting to new observations and fitting new targets. However, previous works have observed that networks trained under non-stationarity exhibit an inability to continue learning, termed loss of plasticity, and eventually a collapse in performance. For off-policy deep value-based RL methods, this phenomenon has been correlated with a decrease in representation rank and the ability to fit random targets, termed capacity loss. Although this correlation has generally been attributed to neural network learning under non-stationarity, the connection to representation dynamics has not been carefully studied in on-policy policy optimization methods. In this work, we empirically study…

Code & Models

Repositories

claire-labo/no-representation-no-trust
PyTorch · Official

Videos

No Representation, No Trust: Connecting Representation, Collapse, and Trust Issues in PPO · SlidesLive

Taxonomy

Topics: Outsourcing and Supply Chain Management

Methods: Entropy Regularization · Proximal Policy Optimization