Nesterov's method with decreasing learning rate leads to accelerated stochastic gradient descent