Gradient Descent Converges Linearly to Flatter Minima than Gradient Flow in Shallow Linear Networks