Token Statistics Transformer: Linear-Time Attention via Variational Rate Reduction