Meta-learning how to Share Credit among Macro-Actions
Ionel-Alexandru Hosu, Traian Rebedea, and Razvan Pascanu

TL;DR
This paper introduces a meta-learned regularization technique that exploits macro-action similarities to improve credit assignment and exploration in reinforcement learning, demonstrated on Atari and StreetFighter II environments.
Contribution
It proposes a novel regularization term, based on a meta-learned similarity matrix, that enhances exploration by reducing the effective dimension of the action space.
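To give a rough intuition for how a similarity matrix can tie actions together, here is a minimal sketch of a generic pairwise smoothness penalty on Q-values. It is not the paper's regularization term, whose exact form is not given on this page. The function name `similarity_penalty`, the hand-written matrix `S`, and the example Q-values are all hypothetical; in the paper the similarity matrix is meta-learned rather than fixed.

```python
def similarity_penalty(q_values, similarity):
    """Generic smoothness penalty (an assumption, not the paper's term).

    Sums similarity[i][j] * (q_values[i] - q_values[j]) ** 2 over all
    pairs i < j, so actions marked as similar are pushed toward similar
    value estimates.
    """
    n = len(q_values)
    total = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            total += similarity[i][j] * (q_values[i] - q_values[j]) ** 2
    return total


# Hypothetical example: three macro-actions, the first two marked as similar.
q = [1.0, 3.0, 10.0]
S = [
    [1.0, 0.9, 0.0],
    [0.9, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]
penalty = similarity_penalty(q, S)
```

Only the pair with non-zero similarity contributes here, so the third action's value estimate is left unconstrained. Adding such a term to a value-learning loss would couple the estimates of similar actions, which is one way to read "reducing the effective dimension of the action space".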
Findings
Significant improvements over Rainbow-DQN in all tested environments.
Macro-action similarity is transferable across related environments.
The approach enhances credit assignment and exploration efficiency.
Abstract
One proposed mechanism to improve exploration in reinforcement learning is the use of macro-actions. Paradoxically, in many scenarios the naive addition of macro-actions does not lead to better exploration, but rather the opposite. It has been argued that this is caused by adding non-useful macros, and multiple works have focused on mechanisms to effectively discover useful, environment-specific macros. In this work, we take a slightly different perspective. We argue that the difficulty stems from the trade-off between reducing the average number of decisions per episode and increasing the size of the action space. Namely, one typically treats each potential macro-action as independent and atomic, hence strictly increasing the search space and making typical exploration strategies inefficient. To address this problem we propose a novel regularization term that exploits…
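The trade-off described in the abstract can be made concrete with simple counting. The sketch below assumes that every sequence of primitive actions up to a given length is added as a macro-action, and that each decision commits to a fixed number of steps. The helper names, the 18-action primitive set, and the 1000-step episode are illustrative assumptions, not figures from the paper.

```python
def num_macro_actions(num_primitive, max_length):
    """Count all primitive-action sequences of length 1..max_length."""
    return sum(num_primitive ** k for k in range(1, max_length + 1))


def decisions_per_episode(episode_steps, macro_length):
    """Decisions needed when each one commits to macro_length steps (rounded up)."""
    return -(-episode_steps // macro_length)


# Illustrative numbers: 18 primitive actions, 1000 environment steps.
for length in (1, 2, 3):
    print(length, num_macro_actions(18, length), decisions_per_episode(1000, length))
```

Under these assumptions, going from length 1 to length 3 cuts the number of decisions by about a factor of three, while the number of distinct actions to explore grows from 18 to 6174 if each macro is treated as independent and atomic.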
