Rethinking Attention: Polynomial Alternatives to Softmax in Transformers