Width Provably Matters in Optimization for Deep Linear Neural Networks