A Method of Selective Attention for Reservoir Based Agents
Kevin McKee

TL;DR
This paper introduces a high-dimensional masking module for deep reinforcement learning that selectively suppresses irrelevant input dimensions, training four times faster than a model with no input suppression and twice as fast as layer normalization.
Contribution
The paper proposes a novel high-dimensional masking approach that improves training speed of reservoir-based agents compared to existing input normalization techniques.
Findings
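The high-dimensional mask yields a four-fold training speedup over a model with no input suppression.
The high-dimensional mask yields a two-fold training speedup over layer normalization.
Adding numerous parameters to the computation of the input mask is surprisingly effective at speeding up training.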
Abstract
Training of deep reinforcement learning agents is slowed considerably by the presence of input dimensions that do not usefully condition the reward function. Existing modules such as layer normalization can be trained with weight decay to act as a form of selective attention, i.e. an input mask, that shrinks the scale of unnecessary inputs, which in turn accelerates training of the policy. Surprisingly, however, we find that adding numerous parameters to the computation of the input mask results in much faster training. A simple, high-dimensional masking module is compared with layer normalization and with a model without any input suppression. The high-dimensional mask resulted in a four-fold speedup in training over the model without input suppression (the null model) and a two-fold speedup in training over the layer normalization method.
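The mechanism the abstract attributes to existing modules, a learned per-dimension gain that weight decay shrinks toward zero on unhelpful inputs, can be illustrated with a toy example. The sketch below is not the paper's method or setup: it swaps the reinforcement learning task for a small linear regression, omits the normalization step of layer normalization, and does not attempt the high-dimensional mask, whose architecture the abstract does not specify. The function name `train_gain_mask`, the hyperparameters, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np

def train_gain_mask(x, y, lr=0.05, weight_decay=0.1, steps=2000):
    """Fit y ~ (x * gain) @ w by gradient descent with L2 weight decay.

    `gain` plays the role of a per-dimension input mask (hypothetical
    stand-in for a layer-norm style scale); `w` is a linear readout.
    """
    n, d = x.shape
    gain = np.ones(d)   # mask starts fully open
    w = np.zeros(d)     # readout starts at zero
    for _ in range(steps):
        err = (x * gain) @ w - y              # residuals, shape (n,)
        grad_prod = 2.0 * (x.T @ err) / n     # d(MSE)/d(gain * w), shape (d,)
        # Chain rule through the product gain * w, plus the decay term.
        grad_gain = grad_prod * w + weight_decay * gain
        grad_w = grad_prod * gain + weight_decay * w
        gain = gain - lr * grad_gain
        w = w - lr * grad_w
    return gain, w

# Synthetic data (assumption): only the first two of six inputs matter.
rng = np.random.default_rng(0)
x = rng.standard_normal((512, 6))
y = 2.0 * x[:, 0] - 1.0 * x[:, 1]

gain, w = train_gain_mask(x, y)
print("learned |gain| per input dimension:", np.round(np.abs(gain), 3))
```

In this toy setting the gains on the four irrelevant dimensions receive almost no task gradient, so weight decay drives them toward zero, while the gains on the two informative dimensions are held up by the regression loss. That is the sense in which a decayed gain behaves like an input mask.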
