The Law of Parsimony in Gradient Descent for Learning Deep Linear Networks