Using a one dimensional parabolic model of the full-batch loss to estimate learning rates during training