Exponential convergence rates for Batch Normalization: The power of length-direction decoupling in non-convex optimization