Local SGD Accelerates Convergence by Exploiting Second Order Information of the Loss Function