Few-Bit Backward: Quantized Gradients of Activation Functions for Memory Footprint Reduction