A Convergence Analysis of Nesterov's Accelerated Gradient Method in Training Deep Linear Neural Networks