Recomposing the Reinforcement Learning Building Blocks with Hypernetworks
Shai Keynan, Elad Sarafian, Sarit Kraus

TL;DR
This paper introduces a hypernetwork-based architecture for reinforcement learning building blocks, enhancing gradient estimation and reducing variance, leading to faster learning and better performance across various tasks and algorithms.
Contribution
It proposes a novel hypernetwork approach that explicitly models interactions between input variables in RL and Meta-RL, improving upon standard concatenation methods.
Findings
Improved gradient approximation in actor-critic algorithms.
Reduced variance in learning steps in Meta-RL.
Consistent performance gains across multiple tasks and algorithms.
Abstract
The Reinforcement Learning (RL) building blocks, i.e., Q-functions and policy networks, usually take elements from the Cartesian product of two domains as input. In particular, the input of the Q-function is both the state and the action, and in multi-task problems (Meta-RL) the policy can take a state and a context. Standard architectures tend to ignore these variables' underlying interpretations and simply concatenate their features into a single vector. In this work, we argue that this choice may lead to poor gradient estimation in actor-critic algorithms and high-variance learning steps in Meta-RL algorithms. To consider the interaction between the input variables, we suggest using a Hypernetwork architecture where a primary network determines the weights of a conditional dynamic network. We show that this approach improves the gradient approximation and reduces the learning step…
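To make the architectural idea in the abstract concrete, the following is a minimal, illustrative sketch of a hypernetwork Q-function in plain Python. It is not the authors' implementation. It assumes that the state feeds the primary network and the action feeds the dynamic network. The class name `HyperQ`, the helper functions, the layer sizes, the activations, and the initialization scale are all hypothetical choices made for illustration.

```python
import math
import random


def linear(weights, bias, x):
    """Dense layer: `weights` is a list of rows, one row per output unit."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def init_linear(rng, n_in, n_out, scale=0.1):
    """Small random weights and zero biases (illustrative initialization)."""
    weights = [[rng.gauss(0.0, scale) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out


class HyperQ:
    """Hypothetical hypernetwork Q-function.

    Assumption: the primary network reads the state and outputs every
    parameter of a small dynamic network, which then maps the action
    to a scalar Q-value.
    """

    def __init__(self, state_dim, action_dim, hidden=8, dyn_hidden=4, seed=0):
        rng = random.Random(seed)
        self.action_dim = action_dim
        self.dyn_hidden = dyn_hidden
        # Dynamic network parameters: W1 (dyn_hidden x action_dim), b1, w2, b2.
        self.n_dyn = dyn_hidden * action_dim + dyn_hidden + dyn_hidden + 1
        self.l1 = init_linear(rng, state_dim, hidden)
        self.l2 = init_linear(rng, hidden, self.n_dyn)

    def dynamic_params(self, state):
        """Primary network: state -> flat vector, sliced into dynamic weights."""
        h = [math.tanh(v) for v in linear(*self.l1, state)]
        flat = linear(*self.l2, h)
        a, d = self.action_dim, self.dyn_hidden
        w1 = [flat[i * a:(i + 1) * a] for i in range(d)]
        b1 = flat[d * a:d * a + d]
        w2 = flat[d * a + d:d * a + 2 * d]
        b2 = flat[-1]
        return w1, b1, w2, b2

    def q_value(self, state, action):
        """Dynamic network: the action passes through state-generated weights."""
        w1, b1, w2, b2 = self.dynamic_params(state)
        z = [max(0.0, v) for v in linear(w1, b1, action)]  # ReLU
        return sum(w * zi for w, zi in zip(w2, z)) + b2


q = HyperQ(state_dim=3, action_dim=2)
value = q.q_value([0.1, 0.2, 0.3], [0.5, -0.5])
```

In a concatenation baseline, the state and action would instead be joined into one vector and passed through a single fixed-weight network. In the sketch above, the action only ever meets weights that were produced from the state, so the interaction between the two inputs is built into the forward pass.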
Taxonomy
Topics: Reinforcement Learning in Robotics · Adversarial Robustness in Machine Learning · Model Reduction and Neural Networks
Methods: HyperNetwork
