SGD and Weight Decay Secretly Minimize the Rank of Your Neural Network