On the Training Instability of Shuffling SGD with Batch Normalization