On Lottery Tickets and Minimal Task Representations in Deep Reinforcement Learning
Marc Aurel Vischer, Robert Tjarko Lange, Henning Sprekeler

TL;DR
This paper investigates how lottery tickets and sparse representations function in deep reinforcement learning, revealing that task-specific masks and input pruning are key to understanding minimal task representations and their robustness.
Contribution
The paper demonstrates that sparse agents in RL require more degrees of freedom than agents trained with behavioural cloning, and that the lottery ticket effect is mainly due to the mask rather than the weight initialization. It also proposes a new rescaling method.
Findings
Sparse agents in RL need more degrees of freedom than supervised agents trained to imitate an expert.
The lottery ticket effect is driven mainly by the mask, not by the weight initialization.
Input masks prune irrelevant dimensions, leading to interpretable task representations.
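To make the input-mask finding concrete, here is a minimal sketch of magnitude pruning on a single input layer, followed by a check for input dimensions whose outgoing weights were all removed. It assumes a NumPy weight matrix of shape (n_inputs, n_hidden); the function names, the toy weights, and the 25% sparsity level are hypothetical and not taken from the paper.

```python
import numpy as np


def magnitude_mask(weights, sparsity):
    """Return a 0/1 mask that zeroes the smallest-magnitude fraction of weights."""
    flat = np.abs(weights).ravel()
    n_prune = int(round(sparsity * flat.size))
    mask = np.ones(flat.size, dtype=weights.dtype)
    if n_prune > 0:
        # Indices of the n_prune smallest magnitudes are masked out.
        order = np.argsort(flat, kind="stable")
        mask[order[:n_prune]] = 0
    return mask.reshape(weights.shape)


def pruned_input_dims(input_mask):
    """Indices of input dimensions whose outgoing weights are all masked."""
    return np.flatnonzero(input_mask.sum(axis=1) == 0)


# Toy input layer (assumption): 4 input dimensions, 8 hidden units.
w_in = np.linspace(1.0, 2.0, 32).reshape(4, 8)
w_in[2] *= 1e-3  # pretend input 2 carries almost no signal

mask = magnitude_mask(w_in, sparsity=0.25)
dead_inputs = pruned_input_dims(mask)
print(dead_inputs)
```

In this toy case the eight smallest weights all belong to input dimension 2, so the mask removes that dimension entirely. Reading off such fully pruned rows is one simple way to inspect which observations a sparse agent still uses.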
Abstract
The lottery ticket hypothesis questions the role of overparameterization in supervised deep learning. But how is the performance of winning lottery tickets affected by the distributional shift inherent to reinforcement learning problems? In this work, we address this question by comparing sparse agents who have to address the non-stationarity of the exploration-exploitation problem with supervised agents trained to imitate an expert. We show that feed-forward networks trained with behavioural cloning compared to reinforcement learning can be pruned to higher levels of sparsity without performance degradation. This suggests that in order to solve the RL-specific distributional shift agents require more degrees of freedom. Using a set of carefully designed baseline conditions, we find that the majority of the lottery ticket effect in both learning paradigms can be attributed to the…
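One way to separate the contribution of the mask from that of the weight initialization is to build several sparse starting points from the same dense initialization and train each one. The sketch below is an illustration under assumptions, not the paper's protocol: the three conditions, their names, and the helper `baseline_inits` are hypothetical.

```python
import numpy as np


def baseline_inits(init_weights, mask, rng):
    """Build sparse initializations that keep or destroy the mask and the weights."""
    kept = mask.astype(bool)

    # Condition 1: original mask and original initial weights (the "ticket").
    ticket = init_weights * mask

    # Condition 2: original mask, surviving initial weights shuffled among kept positions.
    mask_only = np.zeros_like(init_weights)
    mask_only[kept] = rng.permutation(init_weights[kept])

    # Condition 3: mask positions shuffled (same sparsity), initial weights left in place.
    shuffled_mask = rng.permutation(mask.ravel()).reshape(mask.shape)
    weights_only = init_weights * shuffled_mask

    return {
        "mask_and_weights": ticket,
        "mask_only": mask_only,
        "weights_only": weights_only,
    }


rng = np.random.default_rng(0)
init = np.linspace(1.0, 2.0, 32).reshape(4, 8)  # toy dense initialization (assumption)
mask = np.ones((4, 8))
mask[2] = 0  # toy mask that prunes one input dimension

inits = baseline_inits(init, mask, rng)
for name, w in inits.items():
    print(name, np.count_nonzero(w))
```

All three conditions have the same number of non-zero weights, so any performance gap after training reflects where the weights sit or what values they start from, not the sparsity level.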
Taxonomy
Topics: Reinforcement Learning in Robotics · Advanced Bandit Algorithms Research · Domain Adaptation and Few-Shot Learning
Methods: Pruning
