White-Box Transformers via Sparse Rate Reduction | Tomesphere