Loading paper
Quantifying Hyperparameter Transfer and the Importance of Embedding Layer Learning Rate | Tomesphere