Weight Distillation: Transferring the Knowledge in Neural Network Parameters