Learning by Competition of Self-Interested Reinforcement Learning Agents

Stephen Chung

arXiv:2010.09770·cs.LG·December 23, 2021

TL;DR

This paper introduces Weight Maximization, a biologically plausible learning method for neural networks that improves credit assignment and enables training of both continuous and discrete units, demonstrating faster learning than REINFORCE.

Contribution

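The paper proposes Weight Maximization, a learning rule that replaces the global reward signal sent to each hidden unit with a local signal: the change in the $L^{2}$ norm of that unit's outgoing weight. This eases structural credit assignment while keeping the method biologically plausible, and it improves training efficiency.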

Findings

01 Weight Maximization approximates reward gradients in expectation.

02 Networks trained with Weight Maximization learn faster than REINFORCE.

03 Weight Maximization enables training of discrete-valued units.

Abstract

An artificial neural network can be trained by uniformly broadcasting a reward signal to units that implement a REINFORCE learning rule. Though this presents a biologically plausible alternative to backpropagation in training a network, the high variance associated with it renders it impractical to train deep networks. The high variance arises from the inefficient structural credit assignment since a single reward signal is used to evaluate the collective action of all units. To facilitate structural credit assignment, we propose replacing the reward signal to hidden units with the change in the $L^{2}$ norm of the unit's outgoing weight. As such, each hidden unit in the network is trying to maximize the norm of its outgoing weight instead of the global reward, and thus we call this learning method Weight Maximization. We prove that Weight Maximization is approximately following the…
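The mechanism described in the abstract can be illustrated with a minimal sketch. It is not the author's implementation (see the repository listed under Code & Models for that). It assumes a two-layer network of Bernoulli-logistic units, a single-step task, and a plain REINFORCE update without a baseline. The names `weight_max_step`, `reward_fn`, `W1`, `W2`, and `lr` are hypothetical. Output units are updated with the global reward, while each hidden unit is updated with the change in the $L^{2}$ norm of its own outgoing weight.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def weight_max_step(x, W1, W2, reward_fn, rng, lr=0.1):
    """One illustrative update of a two-layer stochastic network.

    W1: (n_hidden, n_in) incoming weights of the hidden units.
    W2: (n_out, n_hidden) incoming weights of the output units;
        column j holds the outgoing weight of hidden unit j.
    Returns the updated W1, the updated W2, and the local rewards
    that were delivered to the hidden units.
    """
    # Forward pass: every unit samples a binary action.
    p_h = sigmoid(W1 @ x)
    h = (rng.random(p_h.shape) < p_h).astype(float)
    p_y = sigmoid(W2 @ h)
    y = (rng.random(p_y.shape) < p_y).astype(float)

    # Output units: REINFORCE driven by the global reward.
    # For a Bernoulli-logistic unit, d log pi / d w = (action - prob) * input.
    r = reward_fn(y)
    W2_new = W2 + lr * r * np.outer(y - p_y, h)

    # Hidden units: the reward is replaced by the change in the L2 norm
    # of each unit's outgoing weight (one column of W2). This is an
    # assumption based on the abstract's wording.
    r_h = np.linalg.norm(W2_new, axis=0) - np.linalg.norm(W2, axis=0)
    W1_new = W1 + lr * r_h[:, None] * np.outer(h - p_h, x)
    return W1_new, W2_new, r_h
```

In this sketch a hidden unit that outputs 0 leaves its outgoing weight untouched, so its local reward is zero on that step. Only units that actually contributed to the output receive a learning signal, which is one way to read the structural credit assignment argument in the abstract.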

Code & Models

Repositories

stephen-chung-mh/weight_max
Official

Videos

Learning by Competition of Self-Interested Reinforcement Learning Agents · underline

Taxonomy

Topics: Reinforcement Learning in Robotics

Methods: REINFORCE