Universal scaling laws in the gradient descent training of neural networks