Adaptive Top-K in SGD for Communication-Efficient Distributed Learning