Layer-wise Pruning of Transformer Attention Heads for Efficient Language Modeling