Loading paper
On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima | Tomesphere