A Self-Attentive Meta-Optimizer with Group-Adaptive Learning Rates and Weight Decay