Revisiting the Shape Convention of Transformer Language Models