Surge Phenomenon in Optimal Learning Rate and Batch Size Scaling