Provable Benefit of Orthogonal Initialization in Optimizing Deep Linear Networks