Transient learning dynamics drive escape from sharp valleys in Stochastic Gradient Descent