Structural Credit Assignment with Coordinated Exploration
Stephen Chung

TL;DR
This paper introduces a biologically plausible method for training neural networks in which units explore in a coordinated way, implemented with Boltzmann machines or recurrent networks, and reports faster learning than traditional independent exploration.
Contribution
It proposes using Boltzmann machines or recurrent networks for coordinated exploration, shows that the negative phase can be removed, and demonstrates faster training than independent exploration methods.
Findings
Coordinated exploration significantly outperforms independent exploration in training speed.
The negative phase in Boltzmann machine training can be eliminated, simplifying the learning process.
The method trains more efficiently than straight-through estimator (STE) backpropagation.
Abstract
A biologically plausible method for training an Artificial Neural Network (ANN) involves treating each unit as a stochastic Reinforcement Learning (RL) agent, thereby considering the network as a team of agents. Consequently, all units can learn via REINFORCE, a local learning rule modulated by a global reward signal, which aligns more closely with biologically observed forms of synaptic plasticity. However, this learning method tends to be slow and does not scale well with the size of the network. This inefficiency arises from two factors impeding effective structural credit assignment: (i) all units independently explore the network, and (ii) a single reward is used to evaluate the actions of all units. Accordingly, methods aimed at improving structural credit assignment can generally be classified into two categories. The first category includes algorithms that enable coordinated…
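To make the baseline in the abstract concrete, here is a minimal NumPy sketch of independent exploration: each unit is a Bernoulli-logistic agent that samples its own action, and every unit is updated with the same global reward through a local REINFORCE rule. This is an illustration under stated assumptions, not the paper's implementation. The function name `reinforce_step`, the single-layer setup, the learning rate, and the toy reward are all hypothetical.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def reinforce_step(W, x, reward_fn, rng, lr=0.1):
    """One REINFORCE update for a layer of independent Bernoulli-logistic units.

    W: (units, inputs) weights; x: (inputs,) input vector;
    reward_fn: maps the sampled action vector to one scalar reward (assumed).
    """
    p = sigmoid(W @ x)                            # firing probability of each unit
    a = (rng.random(p.shape) < p).astype(float)   # each unit samples independently
    r = reward_fn(a)                              # single global reward for the team
    # Local rule: grad of log pi(a_i) w.r.t. w_i is (a_i - p_i) * x,
    # scaled by the shared reward r.
    W_new = W + lr * r * np.outer(a - p, x)
    return W_new, a, r

# Toy usage (hypothetical task): reward +1 only when the whole team
# produces the target pattern, -1 otherwise.
rng = np.random.default_rng(0)
W = np.zeros((4, 3))
x = np.array([1.0, 0.5, -0.5])
target = np.array([1.0, 0.0, 1.0, 0.0])
for _ in range(200):
    W, a, r = reinforce_step(
        W, x, lambda act: 1.0 if np.array_equal(act, target) else -1.0, rng
    )
```

Because every unit receives the same scalar `r` regardless of its own contribution, the update for any one unit is noisy, which is the structural credit assignment problem the abstract points to.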
Taxonomy
Topics: Advanced Memory and Neural Computing · Neural dynamics and brain function · Ferroelectric and Negative Capacitance Devices
Methods: Focus · SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings · REINFORCE
