Crisp Attention: Regularizing Transformers via Structured Sparsity