ByteComp: Revisiting Gradient Compression in Distributed Training